Perl help needed

SpomaMewor · Mar 6, 2003

I am writing a little website and I have to grab a certain piece of text out of another .htm file and insert it into the page that is currently being displayed. I am able to open the file and save it into an array, but I am having trouble figuring out how to search for the text. I have to search through this file, find a certain piece of text which is unique, and then delete everything from it onward. For example, if I was searching this

asdasdasdxasdasd
dsfsdfsdfsdfxsdfsdf

I would need to search for the x and delete everything from the x to the right. Does this make sense?
I know how to grab the array line, so I will be working with single lines of text.

Any help would be greatly appreciated.

Juniper · Mar 6, 2003

Does your script grab the source of the website or only the output? I mean, does it also grab comments? You could insert some special comments and have your script look for those particular comments and delete stuff in between.

SpomaMewor · Mar 6, 2003

the problem is that the page i am searching through is generated by a program and i cannot change its output, but i do know that the output will always be in the same form. i am trying to grab two tables out of this form and put them in my page.

stndn · Mar 6, 2003

so, let's say you have these lines...
ababababababacababababa
kjkjkjkjkjpkjkjkjkjkj
defdefdefdefudefdefdef

then you want to be left with
ababababababa
kjkjkjkjkj
defdefdefdef
??

or basically, do you want to get two table entries from below (based on your last comment)??
<table>
<tr>
<td>first table</td>
<td>second table</td>
<td>third table, grab this</td>
<td>fourth table, grab this</td>
<td>fifth and beyond, disregard</td>
</tr>
</table>

let's just say i'm too confused to help for now .....

notfred · Mar 6, 2003

# loop. if you don't understand this, pay someone else to do this.
while( <FILE> ){

    # your line should have an x in it.
    # you want to remove everything from (and including) the x, to the end of the line.
    $_ =~ s/^(.*?)x.*$/$1/;

    # Now you wanted to push that info onto an array...
    push @array, $_;

# Close the loop.
}

SpomaMewor · Mar 6, 2003

thanks for the help guys. i am far from an expert at perl and in fact i am fairly new, but everything i do i like to understand and do myself, and this is not a job where i am going to pay anyone to do it for me. i was merely asking for a little help. i figured out this way and it appears to work for me.

print (substr($array_nabl[18], 0, index($array_nabl[18], "<td class=s3")));
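A slightly more defensive take on the same substr/index idea, as a rough sketch. index returns -1 when the marker isn't in the line, and substr with a length of -1 would then quietly chop the last character instead of leaving the line alone. The helper name text_before and the sample line are made up for illustration.

```perl
use strict;
use warnings;

# Return the part of $line that comes before $marker.
# If $marker is not present, return $line unchanged.
sub text_before {
    my ($line, $marker) = @_;
    my $pos = index($line, $marker);
    return $pos < 0 ? $line : substr($line, 0, $pos);
}

# Sample line invented for this example.
my $line = '<td class=s1>keep this</td><td class=s3>drop this</td>';
print text_before($line, '<td class=s3'), "\n";
```

With the sample line above this should print only the first cell.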

notfred · Mar 6, 2003

"If you don't understand this..." was only meant to apply to the line:
while( <FILE> ){

I could definitely understand if you didn't get:
$_ =~ s/^(.*?)x.*$/$1/;
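For anyone reading along, here is that substitution spelled out piece by piece, run against the two sample lines from the first post. The shorter s/x.*// form is my own simplification, not something posted above; as far as I can tell it behaves the same on single lines.

```perl
use strict;
use warnings;

my @lines = ("asdasdasdxasdasd\n", "dsfsdfsdfsdfxsdfsdf\n");

for my $line (@lines) {
    # ^        start of the line
    # (.*?)    capture as little as possible, up to...
    # x        ...the first x
    # .*$      then everything else up to the end of the line
    # $1       put back only the captured part
    (my $long = $line) =~ s/^(.*?)x.*$/$1/;

    # Shorter form: delete from the first x onward.
    (my $short = $line) =~ s/x.*//;

    print $long;    # the trailing newline survives, since . does not match "\n"
    print "same\n" if $long eq $short;
}
```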

Agamar · Mar 6, 2003

I would just like to note that NotFred has an insane amount of posts... Do you leave the keyboard?

notfred · Mar 6, 2003

Originally posted by: Agamar
I would just like to note that NotFred has an insane amount of posts... Do you leave the keyboard?

Only to move over to another keyboard at a different computer.... I'm a professional web developer and part time computer science student....

SpomaMewor · Mar 7, 2003

thanks for the help notfred.
i am opening three other html pages using the idea of opening the file, copying it to an array, then closing it; then i do searching and stuff from there. for the main page there are three files that i pull in stuff from. i am using this code to open, copy and close:

$nabl_title_file="FILENAME";

open(nabl_title, $nabl_title_file) || die("Could not open file!");
@array_nabl=<nabl_title>;
close(nabl_title);

foreach $nabl_type (@array_nabl)
{
print "$nabl_type";
}

it seems to be slowing down. can you point me to a place to find a more efficient way of doing this, or is this a delay i will just have to deal with?

Descartes · Mar 7, 2003

Originally posted by: SpomaMewor
thanks for the help notfred.
i am opening three other html pages using the idea of opening the file, copying it to an array, then closing it; then i do searching and stuff from there. for the main page there are three files that i pull in stuff from. i am using this code to open, copy and close:

$nabl_title_file="FILENAME";

open(nabl_title, $nabl_title_file) || die("Could not open file!");
@array_nabl=<nabl_title>;
close(nabl_title);

foreach $nabl_type (@array_nabl)
{
print "$nabl_type";
}

it seems to be slowing down. can you point me to a place to find a more efficient way of doing this, or is this a delay i will just have to deal with?

Is there any particular reason you want it in an array? Would you rather just have the entire contents of the file in a scalar? I would do the following:

$nabl_title_file="FILENAME";

my $nabl_file_contents;
{
open(nabl_title, $nabl_title_file) || die("Could not open file!");
local $/;
$nabl_file_contents = <nabl_title>;
close(nabl_title);
}

Now $nabl_file_contents (tried to stick w/ your naming convention) contains the entire file. You of course don't need the extra block around local $/ like that, but I do it to make the scope more explicit. If this isn't what you want, ignore me.
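Here is roughly the same slurp written with a lexical filehandle and three-argument open, as a sketch rather than the one true way. slurp_file is a hypothetical helper name, and the temp file exists only so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file into one scalar.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Could not open $path: $!";
    local $/;               # undef the input record separator inside this sub only
    my $contents = <$fh>;   # a single read now returns the entire file
    close($fh);
    return $contents;
}

# Throwaway file so the example is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh "line one\nline two\n";
close($tmp_fh);

my $nabl_file_contents = slurp_file($tmp_name);
print length($nabl_file_contents), " characters read\n";
```

Because local $/ sits inside the sub, the separator goes back to its normal value as soon as the sub returns, so later line-by-line reads are not affected.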

notfred · Mar 7, 2003

Originally posted by: SpomaMewor
thanks for the help notfred.
i am opening three other html pages using the idea of opening the file, copying it to an array, then closing it; then i do searching and stuff from there. for the main page there are three files that i pull in stuff from. i am using this code to open, copy and close:

$nabl_title_file="FILENAME";

open(nabl_title, $nabl_title_file) || die("Could not open file!");
@array_nabl=<nabl_title>;
close(nabl_title);

foreach $nabl_type (@array_nabl)
{
print "$nabl_type";
}

it seems to be slowing down. can you point me to a place to find a more efficient way of doing this, or is this a delay i will just have to deal with?

Typically, you use all caps for filehandles. I rewrote the above code for you.

open(NABL_TITLE, $nabl_title_file) || die("Could not open file!");
while ( <NABL_TITLE> ){print};
close NABL_TITLE;
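A note on writing open without parentheses: || binds tighter than the comma, so in that form the die attaches to the filename and never to open itself. Either parenthesise the call or use the low-precedence or. The sketch below uses a made-up path that is assumed not to exist, just to show the difference.

```perl
use strict;
use warnings;

my $missing = "/no/such/dir/no_such_file.htm";   # made-up path, assumed not to exist

# Parsed as open($fh1, '<', ($missing || die ...)).
# $missing is a true string, so die is never reached and the failed open goes unnoticed.
my $silent = eval { open my $fh1, '<', $missing || die "Could not open file!"; 1 };
print "with ||: no error raised\n" if $silent;

# 'or' binds more loosely than the comma, so die is tied to the result of open.
my $caught = eval { open my $fh2, '<', $missing or die "Could not open file: $!\n"; 1 };
print "with or: $@" unless $caught;
```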

SpomaMewor · Mar 7, 2003

i appreciate both of the methods you guys showed me. i am going to test them both out and see what i get for results. i am using this in a cgi file. will one give me better performance than the other?

notfred
remember i am new to the perl language, but is it going to cause a problem if multiple people access the page at the same time, where you are printing the file while it is still open?

stndn · Mar 7, 2003

if you are opening the file for reading, it doesn't matter how many people open the file at once
however, if you are reading the file and someone else is writing to it at the same time, you will get into some problems
if you want to be sure, you can always use file locking with flock
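A minimal sketch of what a locked read could look like. One caveat: flock is advisory, so it only helps if whatever writes the file also takes a lock, which may not be true of a program you can't change. The temp file stands in for one of the generated pages and is purely an assumption for the example.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Throwaway file standing in for one of the generated pages.
my ($tmp_fh, $page) = tempfile(UNLINK => 1);
print $tmp_fh "<table><tr><td>demo</td></tr></table>\n";
close($tmp_fh);

open(my $in, '<', $page) or die "Cannot open $page: $!";

# Shared lock: any number of readers may hold it at once,
# but a writer asking for LOCK_EX has to wait until they finish.
flock($in, LOCK_SH) or die "Cannot lock $page: $!";
my @lines = <$in>;
close($in);    # closing the handle releases the lock

print @lines;
```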

i guess what you want is open the files, substitute, and print the contents?
or just simply print what you read, based on your post at 03/07/2003 7:58 AM ?

the code that Descartes posted opens the file, reads the whole thing into a scalar (variable), and closes the file; printing the scalar is left to you.
the code that notfred posted opens the file, reads and prints each line of the file, then closes the file.

if you are working with multiple files, the code below will go through each of your files in @fileList array, open it, print the contents of the file, close the file, and move to next one ...

------

my $singleFile;
my @fileList = ("file1", "file2", "file3");
foreach $singleFile (@fileList)
{
open (INFILE, "$singleFile") || die "Cannot open $singleFile\n";
print <INFILE>;
close INFILE;
}

--------

of course, if you want to do substitutions or assign to an array, etc, you can change print <INFILE> to something else...

as far as performance goes, i think notfred's code will be slightly faster, since it deals with fewer variables and such... but then again, at this small scale, i'd say the difference is negligible

SpomaMewor · Mar 7, 2003

i will not be writing to any of the files so that is not a concern.
what happens is there is a program that creates a bunch of html files. i am opening and printing these files in different situations so that i am able to display them within my webpage.
in some situations i have to remove the first 8 lines from these pages to remove all the html header information because it is already contained earlier in the page. so i was using this:
splice (@array_NABLBODY,0,8);
then i would print after removing these lines.
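Worth double-checking the count here: the third argument to splice is how many elements to remove, not the index of the last one, so dropping the first 8 lines takes a length of 8. A tiny sketch with ten fake lines (invented for the example):

```perl
use strict;
use warnings;

# Ten fake lines standing in for a generated page.
my @array_NABLBODY = map { "line $_\n" } 1 .. 10;

# splice(ARRAY, OFFSET, LENGTH): LENGTH is a count of elements to remove.
# Starting at offset 0 with length 8 drops lines 1 through 8.
splice(@array_NABLBODY, 0, 8);

print @array_NABLBODY;   # prints "line 9" and "line 10"
```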

my major question is that with the way i was opening files i was seeing a delay as i opened three files to print their contents. will copying to a scalar instead of an array save me time in processing?

Descartes · Mar 7, 2003

Originally posted by: SpomaMewor
i will not be writing to any of the files so that is not a concern.
what happens is there is a program that creates a bunch of html files. i am opening and printing these files in different situations so that i am able to display them within my webpage.
in some situations i have to remove the first 8 lines from these pages to remove all the html header information because it is already contained earlier in the page. so i was using this:
splice (@array_NABLBODY,0,8);
then i would print after removing these lines.

my major question is that with the way i was opening files i was seeing a delay as i opened three files to print their contents. will copying to a scalar instead of an array save me time in processing?

As stndn said, the time differential will be almost negligible; however, reading the entire contents of the file in one go by undefining the input record separator ($/) will almost certainly be faster than reading it into an array. I did a quick search and couldn't find a benchmark, so I may write one later just to see.
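In case anyone wants to measure it themselves, here is a rough sketch using the core Benchmark module. The file size, the iteration count, and the sub names are all arbitrary choices of mine, and the numbers will vary by machine, so treat it as a starting point rather than a result.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use File::Temp qw(tempfile);

# Build a throwaway test file; the size is arbitrary.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
print $tmp_fh "<tr><td>row $_</td></tr>\n" for 1 .. 5000;
close($tmp_fh);

# Read the file line by line into an array; returns the line count.
sub read_into_array {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my @lines = <$fh>;
    close($fh);
    return scalar @lines;
}

# Read the file in one go into a scalar; returns its length.
sub read_into_scalar {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    local $/;
    my $contents = <$fh>;
    close($fh);
    return length $contents;
}

# Run each reader 200 times and print the timings.
timethese(200, {
    array  => \&read_into_array,
    scalar => \&read_into_scalar,
});
```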

You could also try the File::Slurp module. It probably won't be any faster, but it encapsulates both methods discussed in this thread (entire contents into a scalar, or entire contents into an array).
