Webmaster Forum


02-14-2008, 11:53 AM   #1
JoeTMuse
Junior Member
Join Date: 02-14-08
Location: Mount Dora, FL
Posts: 13
Pulling content from search engine results...

Hello all! New to the forum, and I could use some help... I need to compile a list from search engine results. Is there any way to do this? Thanks for the help...

02-14-2008, 01:26 PM   #2
pinkfluffybunny
Contributing Member
Join Date: 05-18-04
Location: Florida
Posts: 966
You mean like your current positions, gains and losses, etc.?

Webposition pro

02-14-2008, 01:46 PM   #3
JoeTMuse
No, more along the lines of pulling actual search results and compiling a list of those results in an easy-on-the-eyes form. Example: searching for Bars/Night Clubs in Michigan on Yahoo Yellow Pages, compiled into a list of these without all the other clutter. I don't know if this is even possible... Thanks for any help though...

02-14-2008, 02:19 PM   #4
nasty.web
v7n Mentor
Join Date: 07-24-06
Posts: 642
Use the Seo4FireFox extension by Aaron Wall. It can save search results to a CSV file.
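
If you would rather script the CSV step yourself, here is a minimal sketch in Python. It is not how Seo4FireFox works internally, just an illustration of the idea. The `results_to_csv` helper, the `title`/`url` column names, and the sample rows are all made up for the example.

```python
import csv
import io


def results_to_csv(results, out):
    """Write (title, url) pairs to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["title", "url"])  # header row; column names are assumptions
    for title, url in results:
        writer.writerow([title, url])


# Hypothetical rows standing in for real search results.
sample = [
    ("Joe's Bar", "http://example.com/joes"),
    ("Club, The", "http://example.com/club"),
]

# Write into an in-memory buffer; a real script would pass an open file instead.
buffer = io.StringIO()
results_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

The `csv` module quotes any field that contains a comma, so a name like "Club, The" still ends up in a single column when the file is opened in a spreadsheet.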

02-14-2008, 02:36 PM   #5
JoeTMuse
Thanks nasty. Is there any way to make it into an automated system? Instead of saving them for database purposes, I need them to be "live." In essence, I choose the search engine, tell it what to search for, and it fetches the results, including only the info I want from each link... and including all the following pages of said results. Whew... That was a mouthful. I know it's gotta be a pretty complicated process, and I can't quit scratching my head over it. Thanks again.

02-14-2008, 02:40 PM   #6
nasty.web
As far as I know, scraping is against Google's TOS. You (your IP) can get banned (I know this from my own experience). Anyway, it's pretty easy to build a scraper with some programming skills. If you know PHP, look at the curl functions.
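
To give a rough idea of what such a scraper involves, here is a minimal sketch using Python's standard library rather than the PHP curl functions (the idea is the same: download the page, then pick out the parts you want). The `fetch` helper, the `LinkCollector` class, the User-Agent string, and the sample HTML are all illustrative assumptions. Real result pages have their own markup, and the site's terms still apply.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch(url, timeout=10):
    """Download a page and return its body as text (not called in this sketch)."""
    req = Request(url, headers={"User-Agent": "example-script/0.1"})  # placeholder UA
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (link text, href) tuples found in the HTML."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up markup standing in for a downloaded results page.
sample_html = (
    '<ul><li><a href="/bar/1">Joe\'s Bar</a></li>'
    '<li><a href="/bar/2">The Club</a> <b>ad</b></li></ul>'
)
links = extract_links(sample_html)
```

In a real run you would pass the output of `fetch(...)` to `extract_links` and then filter the pairs down to the listings you care about; everything outside the links (like the `<b>ad</b>` clutter here) is simply ignored.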

02-14-2008, 02:48 PM   #7
JoeTMuse
Thanks nasty, I'll take a look at curl functions. As far as scraping goes, what if it's for private use? Nothing published, or being redistributed across the web...

02-14-2008, 02:53 PM   #8
nasty.web
From: http://www.google.com/support/webmas...n&answer=35769

Google does not recommend the use of products <...> that send automatic or programmatic queries to Google.

02-14-2008, 03:01 PM   #9
JoeTMuse
Well, that sucks! I guess, unless I wish to be banned from Google, my little project is over. Although, after reading the link you posted, I think what I'm trying to do is different. Maybe I should just ask Google for their point of view? lol Thanks for the help either way.

02-14-2008, 03:05 PM   #10
nasty.web
You can try to imitate human behavior. Don't scrape too much at a time (think of a user with a few browser tabs open), and leave reasonable breaks between requests.
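
A minimal sketch of the "small batches with breaks" idea, in Python. The `fetch_politely` name, the 5 to 15 second pause range, and the cap of 20 requests are assumptions picked for illustration, not known-safe limits. The `fetch` argument is whatever function actually downloads a page; the dry run below uses stand-ins, so nothing is requested and nothing really sleeps.

```python
import random
import time


def fetch_politely(urls, fetch, min_delay=5.0, max_delay=15.0, max_requests=20,
                   sleep=time.sleep):
    """Fetch a small batch of URLs one by one, pausing between requests."""
    pages = []
    for i, url in enumerate(urls[:max_requests]):  # cap the batch size
        if i > 0:
            # Randomised pause so the requests are not evenly spaced.
            sleep(random.uniform(min_delay, max_delay))
        pages.append(fetch(url))
    return pages


# Dry run with stand-ins: no network access and no real waiting.
pauses = []
pages = fetch_politely(
    ["page1", "page2", "page3"],
    fetch=lambda url: "<html>%s</html>" % url,
    sleep=pauses.append,  # record the pause lengths instead of sleeping
)
```

Passing `sleep` and `fetch` in as arguments keeps the pacing logic separate from the download code, so you can try it out without touching any search engine.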

02-14-2008, 03:11 PM   #11
JoeTMuse
What about an applet that would run from my end, as opposed to server side, pulling and placing the data in a spreadsheet? Would that be considered scraping, or would that essentially be what Seo4FireFox does?

02-14-2008, 03:15 PM   #12
nasty.web
What's the difference between your computer and a so-called server? The IP address? Google has many visitors from different addresses, and it bans for suspicious behavior, not for the technology used.

02-14-2008, 03:22 PM   #13
JoeTMuse
Ahhh... I see says the blind man. That makes a lot of sense. Thanks for all the help, you've saved me months of headache now that I know it's not allowed.

02-14-2008, 03:28 PM   #14
nasty.web
As I said before, you can try to simulate human behavior. If you only need small amounts of data, it will work.

02-14-2008, 03:33 PM   #15
JoeTMuse
I could do it in small batches. I'll give that a try. So you think cURL's the way to go?

02-14-2008, 03:40 PM   #16
nasty.web
Quote:
Originally Posted by JoeTMuse
I could do it in small batches. I'll give that a try. So you think cURL's the way to go?

Yes, see: http://php.net/manual/en/ref.curl.php

02-14-2008, 03:48 PM   #17
JoeTMuse
Sweet! Thanks nasty, 'preciate it!

02-14-2008, 03:53 PM   #18
nasty.web
Quote:
Originally Posted by JoeTMuse
Sweet! Thanks nasty, 'preciate it!

My pleasure

02-14-2008, 04:51 PM   #19
DaGrip
Junior Member
Join Date: 01-20-08
Posts: 16
I think there's a bit of confusion here about banning and what that word means in this context.

If you have a script on your website which is doing the scraping, there's a risk of the site being deindexed or 'banned'. Of course there are legitimate ways of doing it. Look into the Google API - they may not want you scraping their results but they do provide you with ways of using their data.

If you have a script on a separate domain which is doing the scraping and, e.g., saving the results to a file or db for the main site to access, the domain doing the scraping may be deindexed. But since that domain is just being used for the scraping and running other scripts, who cares? Since there's no connection from the SE's viewpoint between the scraping domain and your site, there's no danger to your site.

If a search engine receives too many requests in a given period from the same IP address, it is usual for the SE to place an automated temporary ban (usually 24 hours or less) on that IP address, blocking it from making requests and receiving data. This has nothing to do with search engine indexing; it's just a limit on the number of requests an IP can make.

If you are running a tool from your desktop it is your own IP that may be banned. If you're running a script on a server then it's the IP address of that domain which may be banned. Either way using proxies gets around it.

It takes a very high volume of requests to get an IP ban. Simply putting a few seconds delay between requests is usually enough to avoid it.

02-14-2008, 05:21 PM   #20
JoeTMuse
Thanks for the info. That's a definite help.



All times are GMT -7.
© Copyright 2008 V7 Inc