Method for extracting images systematically from a website

ZiKi

Senior member
Mar 19, 2004
417
0
0
Hey, I'm not sure if this is the right place to ask this. I'm working on a new shopping cart for a web store, and I need to extract all the images from the old store as they appear on the page and rename them sequentially in the order in which they appear. For instance, I have 9 items on one page. Each item's picture is named after the item. I want to save them from top to bottom and give them names like SKU-001 to SKU-009. The important thing is that they are named in the order in which they appear, so that when I upload them they are associated with the right product.

Can anyone help, and if you cannot, can you link me to another forum that can? Thanks much in advance.
 

ZiKi

Senior member
Mar 19, 2004
417
0
0
Well, I was wondering if there was software out there already that did this. It sounds like something that could be done in Linux or something.
 

dinkumthinkum

Senior member
Jul 3, 2008
203
0
0
Typically wget or curl are the popular programs for downloading website data. Then you can hack up a quick script to do the rest.
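
As a rough sketch of that "quick script" step: the helper below (extract_img_srcs is a made-up name, not an existing tool) reads a page's HTML on stdin and prints the src of every <img> tag in the order the tags occur, which is what preserves the on-page order. It assumes each <img> tag sits on a single line and uses a double-quoted src attribute; a real HTML parser would be more robust, and relative paths would still need the site's base URL prepended.

```shell
#!/bin/bash

# Print the src value of every <img> tag found in the HTML on stdin,
# one per line, in the order the tags occur in the document.
# Assumption: each tag is on one line and src is double-quoted.
extract_img_srcs() {
    grep -o -i '<img[^>]*>' |
        sed -n 's/.*[sS][rR][cC]="\([^"]*\)".*/\1/p'
}
```

Usage would be something like `curl -s "http://oldstore.example/catalog.html" | extract_img_srcs > urls.txt`, where the URL is only a placeholder for one of the old store's pages.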
 

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
There are many "spider" and website download applications, but matching items to order on the page is not something most users need.

It sounds like you aren't a programmer, so if you can't do the scripting for scraping yourself you can try an existing spider or six and look for one that happens to behave close to the way you want.
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
FTP into the old website (or however you would connect to the site beforehand), find anything ending with jpg, png, gif, etc., and copy. If you are getting images from your old site, you shouldn't need to copy them through the webserver.
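
If the files end up in a local directory (or you have shell access to the server), the "find anything ending with jpg, png, gif" step might look like the sketch below. list_images is a hypothetical helper name, and -iname is a GNU/BSD find extension rather than strict POSIX. Note that a file listing says nothing about the order the images appear on a page, so the renaming still needs the page order from somewhere else.

```shell
#!/bin/bash

# List image files under a directory (default: current directory),
# matching extensions case-insensitively, sorted by path.
list_images() {
    find "${1:-.}" -type f \
        \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' -o -iname '*.gif' \) |
        sort
}
```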
 

Net

Golden Member
Aug 30, 2003
1,592
2
81
what's your time frame? if I have the time I can modify my spider to generate a script for you.

here is what the script would look like:

Code:
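#!/bin/bash

# Image URLs, listed in the order they appear on the page
urls[0]="http://www.jpgheritage.org/sitebuildercontent/sitebuilderpictures/jpg2rh.jpg"
urls[1]="http://www.astronomy2009.org/static/resources/iya_logo_final_b&w.jpg"

# Loop over whatever indices the array has, so nothing is hard-coded
for c in "${!urls[@]}"
do
    # Quote the URL so characters like & are passed to wget untouched
    wget -O "SKU-00$c.jpg" "${urls[$c]}"
done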

if you're going above 9 then use a nested for loop (SKU-$a$b$c.jpg), otherwise the tenth file comes out as SKU-0010.jpg
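
A possible alternative to nesting loops, sketched here on the assumption that a three-digit number is wanted: bash's printf can zero-pad the counter directly. sku_name is a made-up helper name.

```shell
#!/bin/bash

# Build a zero-padded file name from a plain decimal number:
# 1 gives SKU-001.jpg, 12 gives SKU-012.jpg.
sku_name() {
    printf 'SKU-%03d.jpg\n' "$1"
}

# Inside a download loop this could be used as:
#   wget -O "$(sku_name "$c")" "${urls[$c]}"
```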

if you can find something that's already done to parse out the .jpg links to a text file then you can open that text file in vim and do:

:g/^/exe ":s/^/urls[".line(".")."]=\""
:%s/$/\"/

that will wrap each url in the file as:
urls[1]="..."
urls[2]="..."
etc... (vim numbers lines from 1, so the array starts at urls[1] rather than urls[0])

then copy and paste the results of that to the script above and run it
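
If the vim step feels fiddly, the script could instead read the text file directly. This is only a sketch: it assumes one URL per line, already in page order, in a file such as urls.txt (a placeholder name), and it takes the extension from whatever follows the last dot in the URL, which would misbehave on URLs with query strings. plan_downloads and download_all are hypothetical names.

```shell
#!/bin/bash

# Print "SKU-NNN.ext<TAB>url" for each non-empty line of the URL list,
# numbering from 001 in file order.
plan_downloads() {
    local n=0 url ext
    while IFS= read -r url || [ -n "$url" ]; do
        if [ -z "$url" ]; then
            continue            # skip blank lines
        fi
        n=$((n + 1))
        ext="${url##*.}"        # text after the last dot, e.g. jpg
        printf 'SKU-%03d.%s\t%s\n' "$n" "$ext" "$url"
    done < "$1"
}

# Fetch every planned file with wget, saving it under its SKU name.
download_all() {
    local name url
    plan_downloads "$1" | while IFS=$'\t' read -r name url; do
        wget -O "$name" "$url"
    done
}
```

Running `plan_downloads urls.txt` first shows the name-to-URL mapping so it can be checked against the page before `download_all urls.txt` fetches anything.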
 