Webmaster Forum


Google Forum Discuss Google related issues.


10-05-2007, 06:46 AM   #1
simplyDone
Blocking https pages from Google

Hi

My site has https versions of some of its pages, which are for logged-in users. Google has indexed these https pages.

How do I stop Google from indexing these https pages, so that only the http versions of the pages are indexed, and so avoid any duplicate content penalties?

Thanks in advance for your help.
10-05-2007, 09:58 AM   #2
Costin Trifan
Quote:
How do I stop Google from indexing these https pages...

By blocking the robots with a robots.txt file: http://www.robotstxt.org/wc/exclusion.html
10-05-2007, 10:31 AM   #3
bogs
That's right, or maybe try this one:

Quote:
Allow Googlebot to index all http pages but no https pages

Each port must have its own robots.txt file. In particular, if you serve content via both http and https, you’ll need a separate robots.txt file for each of these protocols.

For your http protocol (http://yourserver.com/robots.txt)

User-agent: *
Allow: /

For the https protocol (https://yourserver.com/robots.txt)

User-agent: *
Disallow: /


askapache.com

or visit: http://www.askapache.com/seo/seo-with-robotstxt.html
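
To check how a crawler would read the two files quoted above, here is a minimal sketch using Python's standard urllib.robotparser. The two rule sets are copied from the quote; yourserver.com is the placeholder host from the quote, and page.html is a made-up path. It only shows how the rules parse, not how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Rule sets copied from the quoted advice.
HTTP_ROBOTS = """\
User-agent: *
Allow: /
"""

HTTPS_ROBOTS = """\
User-agent: *
Disallow: /
"""


def parse_rules(text: str) -> RobotFileParser:
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


http_rules = parse_rules(HTTP_ROBOTS)
https_rules = parse_rules(HTTPS_ROBOTS)

# page.html is a hypothetical path used only for this check.
print(http_rules.can_fetch("Googlebot", "http://yourserver.com/page.html"))    # True
print(https_rules.can_fetch("Googlebot", "https://yourserver.com/page.html"))  # False
```

Each parser only knows the file it was given, which mirrors the point in the quote: the http and https versions of the site each need their own robots.txt.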
10-05-2007, 12:59 PM   #4
noob_0001
Question

Could this be done with Apache?

Couldn't you have a rewrite rule that returns 403 Forbidden if the user agent is Googlebot and the URL starts with https?
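
The question is about Apache, but the decision such a rule would make can be sketched in Python as a small WSGI wrapper. This is only an illustration of the condition, not an Apache configuration: forbid_googlebot_on_https, hello_app and status_for are made-up names, and it assumes the server sets wsgi.url_scheme correctly. User-agent strings can also be faked, so it only catches crawlers that identify themselves.

```python
from wsgiref.util import setup_testing_defaults


def forbid_googlebot_on_https(app):
    """Wrap a WSGI app: answer 403 when Googlebot requests an https URL."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        is_https = environ.get("wsgi.url_scheme") == "https"
        if is_https and "googlebot" in user_agent.lower():
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)

    return middleware


def hello_app(environ, start_response):
    """Hypothetical stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


def status_for(scheme, user_agent):
    """Run one fake request through the wrapped app and return its status line."""
    environ = {"wsgi.url_scheme": scheme, "HTTP_USER_AGENT": user_agent}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    seen = []
    wrapped = forbid_googlebot_on_https(hello_app)
    wrapped(environ, lambda status, headers: seen.append(status))
    return seen[0]


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(status_for("https", GOOGLEBOT))  # 403 Forbidden
print(status_for("http", GOOGLEBOT))   # 200 OK
```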
10-06-2007, 01:13 AM   #5
janhvizdak
Quote:
Originally Posted by simplyDone
Hi

My site has https versions of some of its pages, which are for logged-in users. Google has indexed these https pages.

How do I stop Google from indexing these https pages, so that only the http versions of the pages are indexed, and so avoid any duplicate content penalties?

Thanks in advance for your help.
When a visitor enters the right login and password, create a cookie that tells your site engine to allow https. If the cookie doesn't exist, redirect anyone who accesses your site through https to the ordinary http version.

Simple.

Here is some basic info on 301 redirects: http://www.webconfs.com/how-to-redirect-a-webpage.php

Ah, I forgot... The login form should be reached through a <form> HTML element instead of an ordinary link.
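
A rough sketch of the cookie check described above, written as a Python WSGI wrapper. The cookie name logged_in, the /login path and the helper names are all made up for the example, and it assumes the site runs on the default ports. The cookie here only decides whether to redirect; it is not a substitute for checking the login itself.

```python
from http.cookies import SimpleCookie

LOGIN_COOKIE = "logged_in"  # hypothetical cookie name set after a successful login
LOGIN_PATH = "/login"       # hypothetical URL the login form posts to


def redirect_anonymous_https(app):
    """Wrap a WSGI app: 301 https requests without the login cookie to http."""

    def middleware(environ, start_response):
        cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
        is_https = environ.get("wsgi.url_scheme") == "https"
        path = environ.get("PATH_INFO", "/")
        # The login form itself must stay reachable over https.
        if is_https and path != LOGIN_PATH and LOGIN_COOKIE not in cookies:
            host = environ.get("HTTP_HOST", "localhost")
            if host.endswith(":443"):
                host = host[:-4]  # drop the default https port
            query = environ.get("QUERY_STRING", "")
            location = "http://" + host + path + ("?" + query if query else "")
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        return app(environ, start_response)

    return middleware


def site(environ, start_response):
    """Hypothetical stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


def demo(scheme, cookie="", path="/account"):
    """Send one fake request through the wrapper; return (status, headers)."""
    environ = {
        "wsgi.url_scheme": scheme,
        "HTTP_HOST": "example.com",
        "PATH_INFO": path,
        "QUERY_STRING": "",
        "HTTP_COOKIE": cookie,
    }
    seen = []
    wrapped = redirect_anonymous_https(site)
    wrapped(environ, lambda status, headers: seen.append((status, dict(headers))))
    return seen[0]


print(demo("https"))                 # redirected to http://example.com/account
print(demo("https", "logged_in=1"))  # served over https
```

A crawler never logs in, so it never has the cookie and always gets the 301 to the http URL, which is the effect the post is after.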

Last edited by janhvizdak; 10-06-2007 at 01:19 AM. Reason: added one line
10-06-2007, 03:03 AM   #6
Seo Madrid
You can also add a "nofollow,noindex" robots meta tag to the pages that you don't want indexed.
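
If the same template serves both the http and https versions of a page, the tag has to be added only on the https one. A minimal sketch in Python, assuming the page head is built by a function like the made-up render_head below and that the request scheme is known at render time:

```python
from html import escape


def robots_meta(scheme: str) -> str:
    """Return the robots meta tag for https pages, nothing for http pages."""
    if scheme == "https":
        return '<meta name="robots" content="noindex,nofollow">'
    return ""


def render_head(title: str, scheme: str) -> str:
    """Hypothetical template helper that builds the <head> of a page."""
    parts = ["<head>", "<title>" + escape(title) + "</title>"]
    tag = robots_meta(scheme)
    if tag:
        parts.append(tag)
    parts.append("</head>")
    return "\n".join(parts)


print(render_head("My account", "https"))
print(render_head("My account", "http"))
```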
10-08-2007, 08:20 AM   #7
olddocks
Be careful when disallowing pages. If you get a rule wrong, Google may de-index your site.

As far as disallowing https pages goes, the best way is to create a subdomain such as https://secure.domain.com for SSL, and then you can restrict that folder in robots.txt:

Quote:
User-agent: *
Disallow: /secure
All times are GMT -7.
© Copyright 2008 V7 Inc