Read text file by line using Node.js

How do you read a file line by line when only blocks of data can be read?

The Node File System library provides many functions for reading and writing files. You can load an entire file into a string, and you can write lines of text to a file one by one. But what it does not offer is reading a file line by line. This gap becomes annoying when the file is very large, or when you want to restrict reading to a certain number of lines.

Several solutions to this problem have been proposed, with or without an external module, but none of them is satisfactory if you want to reproduce the pattern of PHP, C, or any other language: open a file, read lines in a loop, close the file.

Here is a fairly simple algorithm that achieves this result, without requiring an external module.

  1. An associative array, fileBuffer, maps each file handle to an array of lines already read but not yet returned.
  2. A second associative array, filePtr, maps each file handle to the current read position in the file.
  3. When fgets is called, it first checks whether fileBuffer still holds lines. If so, the first one is removed and returned with shift.
  4. Otherwise, a block of 4096 bytes is read from the file, starting at the position stored in filePtr.
  5. The actual number of bytes read is returned in br.
  6. If br is zero, nothing is left to read: the entry in filePtr is deleted (this is used by the eof function) and false is returned.
  7. The block is converted to a string and split on newlines into an array assigned to fileBuffer[handle].
  8. If a full block was read, the last element of that array is dropped, because in most cases it is a truncated line.
  9. The next position in the file is the old position plus the number of bytes read, minus the size of the dropped element, so the truncated line is read again at the start of the next block.
  10. When the line array becomes empty again, the process resumes at step 4, unless the end of the file has been reached.

Module source code:

var fs = require('fs')

// file handle -> next read position in the file
var filePtr = {}
// file handle -> lines already read but not yet returned
var fileBuffer = {}
// shared 4096-byte block buffer (Buffer.alloc replaces the
// deprecated new Buffer(size) constructor)
var buffer = Buffer.alloc(4096)

exports.fopen = function(path, mode) {
  var handle
  try {
    handle = fs.openSync(path, mode)
  } catch (e) {
    return false   // openSync throws on error; return false like PHP's fopen
  }
  filePtr[handle] = 0
  fileBuffer[handle] = []
  return handle
}

exports.fclose = function(handle) {
  fs.closeSync(handle)
  delete filePtr[handle]     // delete is a no-op if fgets already
  delete fileBuffer[handle]  // removed the entry at end of file
}

exports.fgets = function(handle) {
  if (fileBuffer[handle].length == 0) {
    if (!(handle in filePtr)) return false   // end of file already reached
    var pos = filePtr[handle]
    var br = fs.readSync(handle, buffer, 0, 4096, pos)
    if (br == 0) {
      delete filePtr[handle]   // used by the eof function
      return false
    }
    var lst = buffer.slice(0, br).toString().split("\n")
    var minus = 0
    if (br < 4096) {
      delete filePtr[handle]   // last block: no further read is needed
      if (lst[lst.length - 1] === "") lst.pop()   // drop piece after trailing newline
    } else if (lst.length > 1) {
      minus = lst.pop().length   // last piece is probably a truncated line:
    }                            // it will be read again in the next block
    fileBuffer[handle] = lst
    if (handle in filePtr) filePtr[handle] = pos + br - minus
  }
  var line = fileBuffer[handle].shift()
  return line === undefined ? false : line
}

exports.eof = function(handle) {
  return !(handle in filePtr) && fileBuffer[handle].length == 0
}

You can import the module, as in the example below, or integrate the functions directly into your project by removing the exports. prefix.

Demo Source Code

In this example, the file is read line by line and written line by line to a new file, using the File System functions for writing.

var fs = require('fs')
var readline = require("/scripts/node-readline/node-readline.js")

var source="/scripts/node-readline/demosrc.html"
var target="/scripts/node-readline/demotgt.html"

var r=readline.fopen(source,"r")
if(r===false)
{
   console.log("Error, can't open ", source)
   process.exit(1)
} 

var w = fs.openSync(target,"w")
var count=0
do
{
   var line = readline.fgets(r)
   if (line === false) break   // fgets returns false once the end of file is reached
   console.log(line)
   fs.writeSync(w, line + "\n", null, 'utf8')
   count += 1
}
while (!readline.eof(r))
readline.fclose(r)
fs.closeSync(w)

console.log(count, " lines read.")

Replace the names of the source and target files with your own. You can also adapt the buffer size to your needs: if the file contains lines longer than 4096 bytes (rare for files meant to be read line by line), the buffer must be enlarged accordingly.

Download full source code.