Old Perl programmer, new to Ruby. Is there a method/module out there for
reading a file backwards? This would be useful for logfiles, for example.
Thanks in advance.
Dan
Hi Dan,
Perhaps something like this is what you want?
synack@eugene:~/tmp/fox-0.99.166/textedit$ cat > testFile
line 1
line 2
line 3
synack@eugene:~/tmp/fox-0.99.166/textedit$ irb
irb(main):001:0> f = File.new("testFile")
#<File:0x402281d4>
irb(main):002:0> f.readlines.reverse.each { |line| p line }
"line 3\n"
"line 2\n"
"line 1\n"
["line 3\n", "line 2\n", "line 1\n"]
Hope this helps you some.
--
Signed,
Holden Glova
Ah, crud - I forgot to mention that I *do not* want to read the entire file
into memory, which the readlines method appears to do (unless I'm mistaken).
I also would like to avoid tricks like creating temporary files.
Thanks again.
Dan
At 16:50 3/25/2001 +0900, you wrote:
On Sun, 25 Mar 2001 14:10, I think Daniel Berger wrote:
> Hi all,
>
> Old Perl programmer, new to Ruby. Is there a method/module out there for
> reading a file backwards? This would be useful for logfiles, for example.
>
> Thanks in advance.
>
> Dan
Here's a little Ruby script I use for watching a log file:
if !ARGV[0] || !ARGV[1]
  puts "usage ruby tailReverse.rb Filename bytesToRead [interval in secs]"
  exit
end
x = ARGV[0]             # file to watch
y = ARGV[1].to_i        # number of bytes to read from the end
z = ARGV[2]             # optional polling interval
tint = z ? z.to_i : 60  # default: poll every 60 seconds
while tint > 0
  f = File.new(x)
  if f.stat.size <= y
    # The whole file fits in the window: print it in file order.
    f.each { |line| puts line }
  else
    # The first line read after the seek may be a partial line.
    f.seek(-y, IO::SEEK_END)
    f.readlines.reverse.each { |line| puts line }
  end
  f.close
  puts "--------------------------------------------------"
  sleep(tint)
end
If you're working under Unix, "tail -n" would be a simple solution.
def last_lines(file, nr_lines)
  `tail -n #{nr_lines} #{file}`
end
last_lines("/var/log/messages", 5)
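A variant of the same idea, as a rough sketch only. It assumes a Unix-like system with tail on the PATH; the name last_lines_reversed is made up for illustration. Passing the command as an array keeps the shell out of it, so a filename with spaces or quotes is not misread, and the result is reversed so the newest line comes first.

```ruby
# Hypothetical helper: the last nr_lines lines of file, newest first.
# Assumes a Unix-like system with `tail` on the PATH.
def last_lines_reversed(file, nr_lines)
  # The array form of IO.popen runs tail directly, without a shell.
  IO.popen(["tail", "-n", nr_lines.to_s, file.to_s]) do |io|
    io.readlines.reverse
  end
end
```

Note that tail itself still has to find the last lines, but it does not load the whole file into the Ruby process.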
--
Michael Neumann
Tell me whether this works for you:
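class IO
  # Yield each line of the stream, last line first, reading 4096 bytes
  # at a time from the end of the file.
  def reverse_each(&proc)
    seek(0, SEEK_END)
    i = tell   # start of the part of the file read so far
    j = nil
    buf = ""
    loop do
      # Look for a newline before the final character of the buffer.
      k = buf.rindex("\n", -2)
      if k then
        proc.call buf.slice!(k+1..-1)
      else
        if i == 0 then
          # Start of file reached: what remains is the first line.
          proc.call(buf) unless buf.empty?
          return self
        end
        j, i = i, [0, i-4096].max
        seek(i, SEEK_SET)
        buf = read(j-i) + buf
      end
    end
  end
end
f = File.open("/home/matju/rsa_edit.htm","r")
f.reverse_each {|x| print x }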
=============================
matju
> Ah, crud - I forgot to mention that I *do not* want to read the
> entire file into memory, which the readlines method appears to do
> (unless I'm mistaken). I also would like to avoid tricks like
> creating temporary files.
Here's the logic (but not the ruby code - I'm not that familiar with the
language) for a two-pass approach:
    Scan the file, building an array of file positions and lengths for
      each line (an array of tuples)
    Start at the back of the array and move toward the front
    For each position in the array:
        Seek to the file position
        Read the line
        Process the line
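For what it's worth, the two-pass logic above might look roughly like this in Ruby. This is only a sketch: the method name reverse_each_line_indexed is hypothetical, and it only recognises Ruby's default line separator. It keeps one [offset, length] pair per line in memory rather than the lines themselves.

```ruby
# Two-pass sketch: index every line, then read the lines back in reverse.
# reverse_each_line_indexed is a made-up name, not an existing Ruby method.
def reverse_each_line_indexed(path)
  File.open(path, "rb") do |f|          # binary mode keeps offsets exact
    index = []                          # one [offset, length] pair per line
    offset = 0
    f.each_line do |line|               # first pass: build the index
      index << [offset, line.bytesize]
      offset += line.bytesize
    end
    index.reverse_each do |pos, len|    # second pass: walk it backwards
      f.seek(pos, IO::SEEK_SET)
      yield f.read(len)
    end
  end
end
```

Called as reverse_each_line_indexed("some.log") { |line| print line }, it costs two small integers per line instead of the whole file.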
Here's the logic for a somewhat more complicated, but possibly more
efficient one-pass approach:
    Seek to the end of the file
    Decide on a block size you can afford to keep in memory (you may
      have to keep more in memory if you run into strings longer
      than your blocksize or which straddle blocks)
    Back up the distance equal to this block size
    Read a block
    Scan backwards from the end of the block, skipping past the line-end
      character (or pair, depending on your platform) at the end of the
      block
    For each line [see note below]:
        Extract the substring for the line
        Process the line
    Save any leftover partial line at the beginning of the block
    While more data:
        Back up to the position of the prior block (or BOF if partial block)
        Read the block
        Append any leftover from earlier processing
        Scan backwards from the end of the block
        For each line in the block:
            Extract the substring for the line
            Process the line
        Save any leftover partial line at the beginning of the block
    If any leftover:
        process the leftover (which represents the first line in the file)
Recognition of a line in a block when you are using the second method
occurs when you encounter the line-termination character (or pair,
depending on your platform). Extract the data following the
line-termination character(s) as the substring for the line.
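The one-pass logic could be sketched in Ruby along these lines. The names each_line_from_end and block_size are made up for illustration, and only LF line ends are recognised (String#lines does the splitting), so treat it as a starting point rather than a finished routine.

```ruby
# One-pass sketch: walk the file backwards in fixed-size blocks.
# each_line_from_end and block_size are hypothetical names.
def each_line_from_end(path, block_size = 4096)
  File.open(path, "rb") do |f|
    pos = f.size
    leftover = "".b                      # partial line from the later block
    while pos > 0
      start = [pos - block_size, 0].max  # back up one block (or to BOF)
      f.seek(start, IO::SEEK_SET)
      buf = f.read(pos - start) + leftover
      pos = start
      lines = buf.lines                  # each piece keeps its "\n"
      leftover = lines.shift || "".b     # may continue into the prior block
      lines.reverse_each { |line| yield line }
    end
    yield leftover unless leftover.empty?  # first line of the file
  end
end
```

The leftover string is the only thing that grows: a line longer than block_size simply accumulates there until its beginning is found.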
For both algorithms, you need to open the file in binary mode, to avoid
getting bogus results when you ask for or specify file positions.
Handling all the line-end combinations can be tricky, because different
platforms use different conventions, and the real world will hand you
files that present some anomalies. For example, Macintosh uses CR
(ASCII 13) as the line-end character. UNIX uses LF (ASCII 10). VAX,
CP/M, DOS, and Windows all use CR followed by LF. One too-common anomaly
is the presence of extra CR characters preceding the CR+LF pair. Until
you see the LF you'll be tempted to think you're dealing with a Mac
file. One way of dealing with this is to read from the beginning of the
file until you have seen at least one CR and/or LF character followed by
at least one non-CR/LF. If only CR was seen, set an eol variable to
"\r"; if only LF was seen, set the variable to "\n"; if a CR was
followed by LF, set the variable to "\r\n"; if both were seen but the
order was LF followed by CR, you might set the variable to "\n\r"
(though this isn't a legitimate combo on any platform I know of). Then
back up and use your variable to recognize line-ends canonical to the
platform you've decided you're handling, treating the anomalous strays
as part of the line itself. For this peek-ahead in which you attempt to
determine which platform you're dealing with, you also have to handle
the border condition of a file which consists only of CR and LF
characters.
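A small Ruby sketch of that peek-ahead, under some simplifying assumptions: detect_eol is a hypothetical helper, it looks only at the first run of CR/LF characters in a sample read from the start of the file, and it does not handle a sample that happens to end in the middle of that run.

```ruby
# Sketch of the end-of-line sniffing described above.
# detect_eol is a hypothetical helper. It returns "\r", "\n", "\r\n" or
# "\n\r", or nil when the sample contains no line end at all.
def detect_eol(sample)
  run = sample[/[\r\n]+/]                # first run of CR/LF characters
  return nil unless run
  return "\r\n" if run.include?("\r\n")  # CR+LF, possibly after stray CRs
  return "\n\r" if run.include?("\n\r")  # LF followed by CR
  run[0]                                 # only CRs or only LFs were seen
end
```

A caller might feed it the first few kilobytes of the file, read in binary mode, before starting the backwards scan.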
There's a third approach which is a variation on the first, and which
avoids all the hand-wringing over end-of-line conventions by letting the
underlying I/O libraries handle the problem:
    Open the file in text mode
    Set pos variable to zero
    While more lines:
        Read and discard the line
        Append value of pos to array
        Store the current position of the file in pos
    Process the lines using the position array as in the first option
In this case you're paying the price of a small amount of additional
overhead (making a copy of each line in the first pass) for a measurable
increase in simplicity, robustness, and portability. And you don't need
to store line lengths in this version, either. This third approach is
the one I recommend.
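In Ruby the third approach comes out quite short. A sketch only, with a made-up method name (reverse_each_line_by_offset): the file is opened in the default text mode, gets decides where lines end, and only the starting offsets are stored.

```ruby
# Sketch of the position-array variant: let gets find the line ends and
# remember only where each line starts.
# reverse_each_line_by_offset is a hypothetical name.
def reverse_each_line_by_offset(path)
  File.open(path) do |f|
    offsets = []
    pos = f.pos
    while f.gets            # read and discard the line
      offsets << pos        # where that line started
      pos = f.pos
    end
    offsets.reverse_each do |start|
      f.pos = start
      yield f.gets
    end
  end
end
```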
Hope this helps.
[PS: I dropped ruby...@netlab.co.jp from the address list; I don't
know what the conventions are for the ruby mailing list, but I didn't
want to fall into inadvertent cross-posting, which usually invites
flames.]
--
Bob Kline
mailto:bkl...@rksystems.com
http://www.rksystems.com
Dan,
The following should do what you want (and handle at least some of the
pathological conditions, e.g. line length > buffer size).
Regards, Paul
# irb -r rev --prompt xmp
r = ReverseFile.new File.new '/var/log/messages'
==>#<ReverseFile io=#<File:0x81e46f4>>
r.reverseReadline
==>"Mar 25 21:37:18 norbert dhclient: DHCPREQUEST on ex0 to 192.168.1.1
port 67"
r.reverseReadline
==>"Mar 25 21:12:16 norbert dhclient: bound to 192.168.1.104 -- renewal
in 1502 seconds."
count = 0
==>0
r.reverseReadlines { count += 1 }
==>nil
count
==>257
===== rev.rb =====
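class ReverseFile
  # io: an open, seekable IO; readback: number of bytes to read per chunk.
  def initialize(io, readback=8000)
    raise RuntimeError, "readback must be >1" unless readback > 1
    @io = io
    @readback = readback
    @fpos = @io.stat.size   # everything from @fpos onwards has been consumed
    @linebuf = readbuf
  end

  # Return (or yield) the previous line; nil at the start of the file.
  def reverseReadline
    @linebuf = readbuf if @linebuf == []
    if block_given?
      yield @linebuf.pop
    else
      return @linebuf.pop
    end
  end

  # Yield every remaining line, last first; without a block, return them.
  def reverseReadlines
    lines = []   # 'lins' in the original post; corrected in the follow-up
    while (line=reverseReadline) != nil
      if block_given?
        yield line
      else
        lines << line
      end
    end
    return lines unless block_given?
  end

  def inspect
    "#<ReverseFile io=#{@io}>"
  end

  # Read the chunk that ends at @fpos and split it into lines.
  def readbuf
    seekpos = (@fpos-@readback)>0 ? (@fpos-@readback) : 0
    @io.seek seekpos
    newfpos = @io.pos
    lines = @io.read(@fpos-newfpos).split $/
    if newfpos == 0
      @fpos = 0
      return lines
    elsif lines.length <= 1
      # No complete line in this chunk: double the chunk size and retry.
      @readback = @readback * 2
      return readbuf
    else
      # The first piece may be a partial line; leave it for the next chunk.
      firstline = lines.shift
      @fpos = newfpos + firstline.length
      return lines
    end
  end
  private :readbuf
end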
> def reverseReadlines
> lins = []
This should of course be 'lines = []' - looks like the 'e' got
lost in the copy/paste
Paul
Worked like a charm. I don't know ruby well enough yet to completely
understand this code (the codeblock & proc.call are throwing me at the
moment) but I take it that you're reading in a max. of 4k per line. Is that
correct?
Thanks!
Dan
Well, I'm reading a max of 4k at a time; there is no limit on the length
of a line; the buffer grows bigger until the program knows that it has a
full line.
For very, very long lines, though, the program will become slower. I think
that replacing i-4096 by i-4096-buf.length can do some good, but it also
can do some bad and I don't know how the two would balance. There are of
course a lot of other possible improvements.
matju
Ah, I see. Thanks.
>
> For very, very long lines, though, the program will become slower. I think
> that replacing i-4096 by i-4096-buf.length can do some good, but it also
> can do some bad and i don't know how the two would balance. There are of
> course a lot of other possible improvements.
>
> matju
>
By all means, add them. What would it take to propose an addition to the
API? Since reading log files is a very common activity, I think it would be
welcome.
Thanks again.
Dan
At 23:35 3/28/2001 +0900, you wrote:
>That old nemesis that makes everyone go to work an hour early is rapidly
>approaching. Just for grins, I tried this. I'm running Windows 2000 and
>have my clock set to automatically adjust for DST.
>
>t=Time.local(2001, 4, 1, 2,0)
>puts t
>
>and saw this.
>
>Sun Apr 01 01:00:00 Eastern Standard Time 2001
>
>This can't be right. It's 03:00 EDT (local time). 02:00 EDT(local
>time) doesn't exist by definition.
>
>Having seen this I tried this.
>
>t=Time.local(2001,10,7,1,59,59)
>
>and saw this
>
>Sun Oct 07 02:59:59 Eastern Daylight Time 2001
>
>Now we know that there are two 01:59:59 seconds on the day we get that
>hour back. But neither one of them is the time Ruby reports. In fact the
>time Ruby reports for Oct 7, 2001 doesn't exist by definition.
>
>I also tried this on an NT 4.0 box and had the same experience.
Hey, I screwed up with the second example.
t=Time.local(2001,10,28,1,59,59)
puts t
Sun Oct 28 01:59:59 Eastern Daylight Time 2001
Which is correct for local time.
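One way to check whether a given wall-clock time exists at all is to round-trip it through Time.local and compare the fields. A sketch under assumptions: local_time_exists? is a hypothetical helper, the zone name America/New_York is a guess at US Eastern time, and the result depends on the platform's time zone data, so no output is shown.

```ruby
# Hypothetical helper: does this wall-clock time exist in the current zone?
# If Time.local hands back different fields than it was given, the
# requested time was adjusted, i.e. it does not exist locally.
def local_time_exists?(year, month, day, hour = 0, min = 0, sec = 0)
  t = Time.local(year, month, day, hour, min, sec)
  [t.year, t.month, t.day, t.hour, t.min, t.sec] ==
    [year, month, day, hour, min, sec]
end

ENV["TZ"] = "America/New_York"   # assumption: US Eastern time
puts local_time_exists?(2001, 4, 1, 2, 0)         # inside the spring-forward hour
puts local_time_exists?(2001, 10, 28, 1, 59, 59)  # occurs twice, but does exist
```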
> That old nemesis that makes everyone go to work an hour early is rapidly
> approaching. Just for grins, I tried this. I'm running Windows 2000 and
> have my clock set to automatically adjust for DST.
Have you tried any other applications? There is a known bug with the
Windows libraries reporting the wrong time for the first week in
April.
Dave
I don't know if that's what you're talking about, but afaik the period
between the last Sunday of March and first Sunday of April is Daylight
time in Europe and Standard time in America.
This may be related to the strange "GMT (London, etc.)" line you can
find in Windows' time configurator. See also ruby-talk:6542
matju
> On Wed, 28 Mar 2001, Dave Thomas wrote:
> > Ernest Ellingson <er...@powernav.com> writes:
> > > That old nemesis that makes everyone go to work an hour early is rapidly
> > > approaching. Just for grins, I tried this. I'm running Windows 2000 and
> > > have my clock set to automatically adjust for DST.
> > Have you tried any other applications? There is a known bug with the
> > Windows libraries reporting the wrong time for the first week in
> > April.
>
> I don't know if that's what you're talking about, but afaik the period
> between the last Sunday of March and first Sunday of April is Daylight
> time in Europe and Standard time in America.
Apparently this is a bug:
http://www.zdnet.com/zdnn/stories/news/0,4586,2186402,00.html
Dave
It actually is DST at that time. If isdst is false why does 02:00 show up
as 01:00 EST?
Ernie
% uname -a && ruby -e 'p Time.local(2001, 4, 1, 2, 0)' && ruby -v
Linux vilya 2.4.2 #6 Fri Mar 16 14:12:57 EST 2001 i686 unknown
Sun Apr 01 03:00:00 EDT 2001
ruby 1.6.2 (2001-01-18) [i686-linux]
% uname -a && ruby -e 'p Time.local(2001, 4, 1, 2, 0)' && ruby -v
OpenBSD narya 2.7 GENERIC#25 i386
Sun Apr 01 03:00:00 EDT 2001
ruby 1.6.2 (2001-01-18) [i386-openbsd2.7]
% uname -a && ruby -e 'p Time.local(2001, 4, 1, 2, 0)' && ruby -v
SunOS laplace.stat.cwru.edu 5.7 Generic_106541-12 sun4u sparc
Sun Apr 01 01:00:00 EST 2001
ruby 1.6.3 (2001-02-24) [sparc-solaris2.7]
% uname -a && ruby -e 'p Time.local(2001, 4, 1, 2, 0)' && ruby -v
AIX theory2 2 4 000069614600 unknown
Sun Apr 01 01:00:00 EST 2001
ruby 1.6.2 (2001-01-18) [powerpc-aix4.2.0.0.0]
--
ruby -F- -e '$, = $; . sub /./, " Another "; print %w.Just Rewbie .. join'
In message "[ruby-talk:13279] Re: Time Travel with Ruby"
on 01/03/29, Ernest Ellingson <er...@powernav.com> writes:
|Well you can blame it on Windows but has anyone tried this on other systems.
|t=Time.local(2001, 4,1,2,0)
|puts t
This prints "Sun Apr 01 03:00:00 EDT 2001" on my Linux box.
Since my timezone does not have DST, I tried on your timezone.
FYI, Ruby had a bug in DST boundary on versions prior to 2000-07-20.
matz.
The inventors of DST only hoped to get the whole world to start work an
hour early during the days with long daylight hours. I'm sure they weren't
aware of all the unintended consequences.
Ernie
Of course it messes up -- April Fools!
First and most important:
Thanks a lot for this very interesting programming language!
Unfortunately Time.local also behaves strangely on my W95 notebook:
Time.local on an unpatched ruby-1.6.2 is systematically 1 hour off
during DST on my system. In Switzerland, as in other European countries,
we switched to DST on 2001-03-25 02:00:00, so we lost the hour from
02:00 to 02:59.
puts Time.local(2001, 3, 25, 1, 59, 59)
==> Sun Mar 25 00:59:59 GMT+1:00 2001 # wrong
puts Time.local(2001, 3, 25, 2, 0, 0)
==> Sun Mar 25 03:00:00 GMT+1:00 2001 # dubious
puts Time.local(2001, 3, 25, 3, 0, 0)
==> Sun Mar 25 04:00:00 GMT+1:00 2001 # wrong
We will switch back from DST on 2001-10-28 02:00:00
puts Time.local(2001, 10, 28, 0, 0, 0)
==> Sun Oct 28 01:00:00 GMT+1:00 2001 # still wrong
puts Time.local(2001, 10, 28, 1, 0, 0)
==> Sun Oct 28 01:00:00 GMT+1:00 2001 # strangely enough this is correct:
                                      # the hour from 01:00 to 02:00 on
                                      # 2001-10-28 is the only hour where
                                      # Ruby gets the correct time during DST
puts Time.local(2001, 10, 28, 2, 0, 0)
==> Sun Oct 28 02:00:00 GMT+1:00 2001 # correct
The last Sunday in March (when we will switch to DST again) in 2002 is
2002-03-31
puts Time.local(2002, 3, 31, 0, 59, 59)
==> Sun Mar 31 00:59:59 GMT+1:00 2002 # correct
puts Time.local(2002, 3, 31, 1, 0, 0)
==> Sun Mar 31 00:00:00 GMT+1:00 2002 # wrong
etc.
(As you see, Cygwin on W95 doesn't display correct time zone
descriptions during DST, but I didn't check what would be needed to
change Cygwin to use the TZ descriptions from Windows. The routine
make_time_t from time.c displays the same strange behavior if it is
compiled with VC6, which requires the replacement of gettimeofday by a
simple call to time(NULL).)
I applied the following patch to time.c to fix the strange behavior of
ruby-1.6.2's Time.local on my W95 box.
I have no idea if this patch is also working on other systems.
--%<------------------------------------------------------
*** time.c Fri Dec 22 03:22:04 2000
--- time.c.new Thu Mar 29 22:35:54 2001
*************** make_time_t(tptr, utc_or_local)
*** 357,365 ****
}
tm = localtime(&guess);
if (!tm) goto error;
! if (lt.tm_isdst != tm->tm_isdst) {
! guess -= 3600;
}
#endif
if (guess < 0) {
goto out_of_range;
--- 357,368 ----
}
tm = localtime(&guess);
if (!tm) goto error;
! if (tm->tm_isdst) {
! if (lt.tm_isdst == tm->tm_isdst || !lt.tm_isdst) {
! guess -= 3600;
! }
}
+
#endif
if (guess < 0) {
goto out_of_range;
--%<------------------------------------------------------
I used this script to check my patch. Store it as e.g. timelocal.rb
and call it as
ruby timelocal.rb year month day hours minutes seconds
e.g.
ruby timelocal.rb 2001 3 25 3 0 0
--%<------------------------------------------------------
def timegm(*args)
  args.push(1)
  mk_time(*args)
end

def timelocal(*args)
  mk_time(*args)
end

# Binary search over seconds since the epoch for the time whose
# broken-down fields equal the inputs.
def mk_time(year, month, day, hours=0, minutes=0, seconds=0, gm_flag=0)
  below_secs = 0
  above_secs = (2 ** 31 - 1).to_i
  secs = (below_secs + above_secs) / 2
  inputs = [year, month, day, hours, minutes, seconds]
  tests = []
  compare = 0
  # counter = 0
  while (below_secs <= above_secs)
    # counter += 1
    secs = (below_secs + above_secs) / 2
    if (gm_flag == 0)
      tests = ((Time.at(secs).to_a)[0..5]).reverse
    else
      tests = ((Time.at(secs).gmtime.to_a)[0..5]).reverse
    end
    compare = inputs <=> tests
    if (compare == 0)
      # puts "found after #{counter} iterations"
      return secs
    elsif (compare < 0)
      above_secs = secs - 1
    elsif (compare > 0)
      below_secs = secs + 1
    else
      raise "something terrible"
    end
  end
  raise "date non-existent #{inputs.join(', ')}"
end

if __FILE__ == $0
  args = ARGV.map {|i| i.to_i}
  t = timelocal(*args)
  print "timelocal: "
  puts Time.at(t)
  t = timegm(*args)
  print "timegm: "
  puts Time.at(t).utc
  t = Time.local(*args).to_i
  print "Time.local: "
  puts Time.at(t)
  t = Time.gm(*args).to_i
  print "Time.gm: "
  puts Time.at(t).utc
end
--%<------------------------------------------------------
Okay. I charge $70/hour for consulting work.
:-)
> What would it take to propose an addition to the API?
A good idea, relevant to said API, and an email to this list.
> Since reading log files is a very common activity,
It is common, but it is not *very* common. Thus it may belong in the RAA,
but not in ruby.tar.gz, if you see what I mean.
matju
In message "[ruby-talk:13348] Re: Time Travel with Ruby"
on 01/03/30, "Samuel Kilchenmann" <skil...@swissonline.ch> writes:
|Unfortunately Time.local also behaves strangely on my W95 notebook:
I verified the bug you reported. It will be fixed very soon.
Thank you.
matz.