Webmaster Forum


Old 10-13-2007, 02:28 PM   #1 (permalink)
Banned
 
Rulu's Avatar
 
Join Date: 10-11-07
Posts: 240
iTrader: 0 / 0%
Latest Blog:
None

Rulu is liked by many
Search engines unable to read coding properly

I need the help of your intelligent eye. A friend of mine loaded my index.html page into his software and submitted it to many search engines in May and again in July. To date, my site still does not come up when I enter a keyword into a search engine. When I entered my URL, sutrasmantras.info, into Google, the result showed the first Doctype tag of my source program as follows:
˙ţ< ! D O C T Y P E H T M L P U B L I C " - / / W 3 C / / D T D ...
˙ţ< ! D O C T Y P E H T M L P U B L I C " - / / W 3 C / / D T D H T M L 4 . 0 1 T r a n s i t i o n a l / / E N " " h t t p : / / w w w . w 3 . o r g / T R / h t m l 4 / l o o s e . d t d " > < h t m l > < h e a d > < M E T A H T T P ...

All of my web pages have passed the W3 School's site validation, and I don't understand the problem. The source program of my home page is copied below for your scrutiny. I'd appreciate your comments and suggestions. Thanks a lot.
-------------------
Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<META HTTP-EQUIV= "Content-Type" CONTENT= "text/html; CHARSET=UTF-8">
<title>Buddha Sutras Mantras Sanskrit</title>
<META NAME="Description" CONTENT="This website presents the English translations of a few Mahayana Sutras, Buddhist prayers and mantras, Buddhist terms, as well as a pronunciation guideline for the Sanskrit alphabet.">
<META NAME="Keywords" CONTENT="buddhist sutras, buddhist mantras, sanskrit pronunciation, sanskrit alphabet, mahayana sutras, buddhist prayers, buddhist terms, chinese pure land school, amitabha buddha">
<META NAME= "AUTHOR" CONTENT= "Rulu">
</head>
<body bgcolor="#ffffee" text="#000000" link="blue" vlink="purple" alink="green">
<div align="center">
<BR>
<p><img src="buddha8light.jpg" width= "312" height="522" alt="Buddha8light">
<BR><BR>
<img src="lotus2.jpg" width="72" height="60" alt="Lotus"><img src="lotus2.jpg" width="72" height="60" alt="Lotus"><img src="lotus2.jpg" width="72" height="60" alt="Lotus"><img src="lotus2.jpg" width="72" height="60" alt="Lotus"><img src="lotus2.jpg" width="72" height="60" alt="Lotus"><img src="lotus2.jpg" width="72" height="60" alt="Lotus"><img src="lotus2.jpg" width="72" height="60" alt="Lotus">
</p>
<h1><img src="lotus2.jpg" width="36" height="30" alt="Lotus"><img src="lotus2.jpg" width="36" height="30" alt="Lotus">* Buddha Sutras Mantras Sanskrit* <img src="lotus2.jpg" width="36" height="30" alt="Lotus"><img src="lotus2.jpg" width="36" height="30" alt="Lotus"> </h1>
<BR>
<p><font face= "Tahoma" size ="+1"> oṁ mune mune mahāmunaye svāhā<br>
One who reads a sutra is receiving the teachings of the Buddha in His presence.</font></p>
<h4><font face = "Tahoma"><A HREF = "intro.html">Please enter</A></font></h4>
<p>
<img src="valid-html401.gif" alt="Valid HTML 4.01 Transitional" height="31" width="88">
<br>
The files on this website may be freely downloaded for use but not for sale.
<br>
Updated 9/09/2007
</p>
</div>
</body>
</html>

Last edited by chicgeek : 10-14-2007 at 03:56 AM.
Old 10-13-2007, 02:31 PM   #2 (permalink)
Southern Brat
 
Cricket's Avatar
 
Join Date: 10-13-03
Location: Texas
Posts: 16,161
iTrader: 0 / 0%
Cricket is supreme webmaster material
This isn't a code issue. This is an SEO issue. Read, study, and implement the techniques taught in the following thread and you will be fine. http://www.v7n.com/forums/seo-forum/...o-v7n-way.html


Moving this thread to the SEO Forum.
Old 10-13-2007, 02:48 PM   #3 (permalink)
Rulu
Hello Cricket,

At your suggestion, I have read John Scott's thread on search engine optimization. I have chosen my keywords carefully, but search engines have not been able to read the keywords, title, and content in my meta tags. That is why I thought mine was a coding problem. Thank you.
Old 10-15-2007, 03:45 AM   #4 (permalink)
Contributing Member
 
Steven_D's Avatar
 
Join Date: 09-02-07
Location: In my own little world
Posts: 497
iTrader: 1 / 100%
Steven_D is liked by somebody
I think he is right. If you google his domain and then click the cached page, it comes up as static text.

˙ţ< ! D O C T Y P E H T M L P U B L I C " -

There are some characters in front of your DOCTYPE tag; remove them. I would imagine it is a language thing, and they were likely put there by the program you used to build the site.

Don't tell me they don't exist, because I can see them clear as day in the cached page. Open the pages in something like TextPad and you should be able to see them. Remove them and save the pages as HTML files from TextPad.
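
Some editors hide these leading characters, so another way to check is to look at the raw bytes of the file. Here is a minimal Python sketch that reports whether a file starts with a byte-order mark (BOM). The helper names `detect_bom` and `detect_bom_in_file` are made up for illustration, and it is only an assumption that the stray characters here are a BOM.

```python
import codecs

# Known byte-order marks. UTF-32 LE is listed before UTF-16 LE because
# its BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(raw: bytes):
    """Return the name of the BOM at the start of raw, or None if there is none."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    return None


def detect_bom_in_file(path: str):
    """Read only the first four bytes of the file and check them for a BOM."""
    with open(path, "rb") as handle:
        return detect_bom(handle.read(4))


print(detect_bom(b"\xff\xfe<\x00!\x00"))  # UTF-16 LE
print(detect_bom(b"<!DOCTYPE HTML"))      # None
```

Running `detect_bom_in_file("index.html")` on the local copy before uploading would show whether the extra bytes are already there on the PC.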
Old 10-15-2007, 08:48 AM   #5 (permalink)
Rulu
Hello Steven,

Thank you for your observation. When I opened the files stored at my present web host, I saw that their system had inserted the code 'yp' before each of my Doctype tags. Their tech consultant told me that when my website is displayed, this code does not show up in the drop-down source program. They are right about that. (If you check it, you will not see any 't' or 'yp' code there.) They assured me that search engines would not see this code, since they cannot open my files stored with the host, and that they should index my web pages properly. They are wrong there.

All the files created in Notepad on my PC, of course, do not have any extra characters before the Doctype tag. I did not use any special software to create my files. I just typed HTML code into Notepad. Thanks again.
Old 10-15-2007, 09:58 AM   #6 (permalink)
Contributing Member
 
x3mario's Avatar
 
Join Date: 07-24-07
Posts: 1,467
iTrader: 0 / 0%
x3mario is liked by somebody
Steven is right. Remove the "˙ţ" characters on the first line, that is, the ones at the start of "˙ţ< ! D O C T Y P E H T M L P U B L I C " -".
Old 10-15-2007, 01:57 PM   #7 (permalink)
Rulu
Hello x3mario,

This "˙ţ" is created by Google, not in my file. How do I delete it?
Old 10-15-2007, 02:05 PM   #8 (permalink)
Steven_D
Quote:
Originally Posted by Rulu
Hello x3mario,

This "˙ţ" is created by Google, not in my file. How do I delete it?
It can't be added by Google, as Google does not have write access to your web server.

As you stated,

They assured me that search engines will not see this code since they cannot open my files stored with the hosts and that they should index my web pages properly. They are wrong there.

The characters are in the file. If you check the page source with IE or FF you can't see them, because IE and FF will remove any characters before the doctype tag, as they know they are not part of the web page. HOWEVER, Googlebot does not scan your site with IE or FF, and their bot is picking up the code.
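
One plausible explanation for the difference, which the thread does not confirm, is that a browser treats the two leading bytes as a byte-order mark and decodes the rest of the file as UTF-16, while a simpler reader trusts the `CHARSET=UTF-8` meta tag and reads the file byte by byte. The Python sketch below illustrates that assumption with a made-up page. It is not how Googlebot or any particular browser is actually implemented.

```python
import codecs

# A made-up page, saved the way a UTF-16 little-endian file with a BOM would be.
page = "<!DOCTYPE HTML><html><head><title>Buddha Sutras Mantras Sanskrit</title></head></html>"
raw = codecs.BOM_UTF16_LE + page.encode("utf-16-le")

# BOM-aware reader: the "utf-16" codec consumes the BOM and picks the byte order.
as_browser = raw.decode("utf-16")

# Naive reader: ignores the BOM and trusts the UTF-8 declared in the meta tag.
as_naive_bot = raw.decode("utf-8", errors="replace")

print("<title>" in as_browser)    # True
print("<title>" in as_naive_bot)  # False
```

In the naive reading every letter is followed by a NUL character, so no tag name matches. That would be consistent with a tool reporting that the page has no title.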

To see what Googlebot is seeing, click this link:

http://www.gritechnologies.com/tools...smantras.info/

It is a link to poodlebot, which reads your site like a search engine. The other option is to download the Lynx text browser and find your site.

Long story short, their system is adding shit to your site, which is messing it up. Ask them to remove it, or to give you the ability to remove it, or change host. Hosting a site is piss easy, and if they are messing it up that badly they don't deserve your business.

Don't forget my karma if this helps. Some bastard gave me negative karma lol
Old 10-15-2007, 02:55 PM   #9 (permalink)
Rulu
Hello Steven,

Thanks a lot for the link and your insight. The diagnostics of Googlebot say this:
Warning: No h1, h2 or h3 Headings were found.
No Title Specified
ÿþ

It is amazing how Googlebot picks up this hidden 'ÿþ' code generated by my host's system. Do you mean that the "˙ţ" displayed by Google is somehow converted from the 'ÿþ' code?
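
The two spellings may well be the same two bytes shown in different code pages. Bytes FF FE are the UTF-16 little-endian byte-order mark, and the short Python sketch below shows how they look when read as single-byte text. Which code pages Google and the diagnostic tool actually used is an assumption here; cp1252 and cp1250 are simply the ones that reproduce the two spellings.

```python
bom = b"\xff\xfe"  # the UTF-16 little-endian byte-order mark

print(bom.decode("cp1252"))  # ÿþ  (Western European code page)
print(bom.decode("cp1250"))  # ˙ţ  (Central European code page)

# UTF-16 stores each ASCII letter as two bytes, the second being zero.
# Read as single-byte text, those zero bytes show up as gaps between letters.
raw = bom + "<!DOCTYPE".encode("utf-16-le")
spaced = raw.decode("cp1250").replace("\x00", " ")
print(spaced)                # ˙ţ< ! D O C T Y P E
```

The last line matches the spaced-out Doctype that Google displays, which fits the idea that the file is stored as UTF-16 rather than as the UTF-8 named in the meta tag.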

When I first detected the ÿþ code in my files stored with my host, I deleted each of them, file by file. The tech support of my web host told me that I should never edit the stored files. I should always edit files on my own PC and then ftp them. So I re-uploaded my files. Of course, the ÿþ codes were then restored.

By the way, I got the same "˙ţ" symbol from Google when I was with my first web host brinkster.com, but their stored files did not show this ÿþ code. My html teacher, who had submitted my home page to search engines, suggested that I change web host, and I ended up with this second host prestigephp.com. Weak in technology, I am not sure what to say to them.

Your good karma is well documented in your own alaya consciousness. I will also remember your kindness.
Old 10-15-2007, 04:03 PM   #10 (permalink)
Rulu
Hello Steven,

I have written to my web host, quoting the strong evidence you have given me. Their tech support has replied that they are sending my case to their administrator and will get back to me.

With my luck, it seems this 'yp' bug is common among web hosts. How can I determine whether the next one also has it? The sales staff usually don't know this kind of thing. I am totally exhausted by this unexpected problem with my simple website and would like to settle down. Thanks again.
Old 10-15-2007, 07:28 PM   #11 (permalink)
Steven_D
I don't think it is something that they will tell you about before you sign up with them. I am assuming it is some sort of free or shared hosting.

The tech is just scaring you about never editing files on the FTP server. If you have some sort of remote access, then edit them, revisit the poodle-bot page, and press F5 to refresh it; you will quickly see whether it has been resolved.

Wait to hear back from their tech team and go from there. If they are willing to find a solution, then I would stay with them, as not many hosts will bend to suit people's needs, due to the number of requests they get. If they can't or won't fix it, then just edit the pages manually on the server. They may have a system that checks and re-adds the code every day or so, though, so check it for a week or two.

Can you give us the domain name of the host so that others know to steer clear?
Old 10-15-2007, 09:25 PM   #12 (permalink)
Rulu
Hello Steven,

Yes, I can go to their control panel and edit my files stored with them. Except for the index.html file, all my other files contain Chinese characters. With their 'yp' code in place, the Chinese characters display well on my website as well as in the drop-down source program.

However, I suspected that the 'yp' code would cause a problem with the search engines. In July, when I deleted the 'yp' code in front of each Doctype tag, all the Chinese characters became garbled. I then had to copy the correct ones manually from the files on my PC into their files. When their files were saved, all the Chinese characters in them automatically turned into codes. Although the Chinese characters were still displayed correctly on my website, they were shown as codes in the drop-down source program. (I did not know what other visitors could see.) Imagine: whenever I edit my files, I would have to copy the changes manually into their files, because uploading files causes their 'yp' code to be inserted.

When I took this problem to my host, a succession of tech support staff told me that I should only FTP files and trust that they would work. This web host is prestigePHP.com. I must give them credit for responding quickly, if not necessarily correctly in my case. It took my previous host, Brinkster.com, a day or two to give me a reply, and they said any trouble with search engines would be caused by my own coding problems--case closed.

Thanks a lot again.

Last edited by Cricket : 10-16-2007 at 12:35 AM.
Old 10-15-2007, 09:28 PM   #13 (permalink)
Steven_D
OK, I get you. The yp is there to tell the browser that the page uses international language packs; I had figured it was just junk.

You are going to need to find another method for coding the Chinese characters into your site. Let me have a play with it for an hour or so, and I'll come back with something that works..... hopefully
Old 10-15-2007, 09:42 PM   #14 (permalink)
Steven_D
Rulu, I would really like to help you, but I think this is just a little outside my skill set. I took a closer look at your code and I just don't understand how it was coded.

Things like this at the very bottom of the page:

<p align = "right">
Updated 9/09/2007

</p> (LOL, THE FORUM REMOVED ALL THE #160's. THERE WERE LIKE 15 OF THEM; CHECK YOUR PAGE SOURCE.)

What is the point of all the #160's?

In theory, your Chinese characters should not be showing up, as you are using CHARSET UTF-8, so I am assuming that you are using the codes (sorry, I don't know what they are called) where you put in something like #47jfhuyt and it displays a Chinese character.
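
Codes of that kind are numeric character references: an ampersand, a hash sign, the character's Unicode number, and a semicolon. The short Python sketch below shows the round trip, with a helper name `to_numeric_refs` invented for the example. The character 佛 is just a sample and is not taken from the site. The #160 code works the same way and stands for a non-breaking space.

```python
import html


def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_numeric_refs("佛 Buddha")
print(encoded)                 # &#20315; Buddha
print(html.unescape(encoded))  # 佛 Buddha

# &#160; and &nbsp; both stand for the non-breaking space, U+00A0.
print(html.unescape("&#160;") == html.unescape("&nbsp;") == "\u00a0")  # True
```

A page written entirely with such references contains only ASCII bytes, so it displays the same whatever encoding the file is saved in.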

There are a lot of sites having the same problem as you (http://72.14.253.104/search?q=cache:...&ct=clnk&cd=14). The only thing I can suggest is that you wait for someone else who has had this same problem to fix it, or that you remove the Chinese characters from your site..... (probably not the best solution).

I just don't want to tell you something that is wrong. I hope you can solve it. After closer analysis, I don't think it is the host; it is the code on your HTML pages.

Sorry.
Old 10-16-2007, 11:08 AM   #15 (permalink)
Rulu
Hello Steven,

Thank you for checking the coding on my intro.html page. By default, the spacing between words is one space only. To create more spaces we can use the HTML character code, either (&nbsp or ( ). I put in a number of these blank characters as fillers, so that the right-aligned text does not hit the extreme right.

The tech support has come back to me, saying that 'yp' is not a bug. It is the way their system reads my files. Specifically, they say:
-------------
The current encoding format of your file is UCS-2 Little Endian. You can see that using an advanced text editor. This is what causes this issue.

Notepad can create text files in any encoding that Windows is configured for, which means that if Windows is configured to use the encoding mentioned, Notepad will use it for every simple text file it creates.
I would suggest that you download Notepad++
(http://notepad-plus.sourceforge.net/), set the 'Format' to 'UTF-8', and then test the issue again.
---------------------------
I created all of my files in regular Notepad and saved them as UNICODE, following the online HTML tutorials. I don't quite understand why they call it UCS-2 Little Endian, a format I have not learned. I could easily resave my files as UTF-8, an option available in regular Notepad. However, I will follow their instructions and download Notepad++ if this can solve the problem. If I succeed, I will surely share the good news with you.
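
As far as I know, the "Unicode" choice in Notepad's Save As dialog writes UTF-16 little-endian text with a byte-order mark, which is what other editors label UCS-2 Little Endian, so the two names would refer to the same format. Resaving can also be scripted. Below is a minimal Python sketch that assumes the source file really is UTF-16 with a BOM. The function name `resave_as_utf8` and the file names in the usage note are invented for the example.

```python
from pathlib import Path


def resave_as_utf8(source: Path, target: Path) -> None:
    """Re-encode a UTF-16 file that starts with a BOM as UTF-8 without a BOM."""
    # The "utf-16" codec consumes the BOM and uses it to choose the byte order.
    # read_text also normalizes Windows line endings to "\n".
    text = source.read_text(encoding="utf-16")
    # Plain "utf-8" writes no BOM; "utf-8-sig" would add one.
    target.write_bytes(text.encode("utf-8"))
```

A call such as `resave_as_utf8(Path("index_unicode.html"), Path("index.html"))` would produce a copy that matches the `CHARSET=UTF-8` declared in the page's meta tag.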

I have contacted a website building company associated with a member of this Forum. They said that they saw coding problems on my pages. It is going to be a $600 job to fix them. I have not responded to them.

It is amazing that all my pages have passed the W3C School's site validation. I take that to mean that my coding is good enough for them. If resaving all of the files as UTF-8 in Notepad++ will solve the problem, I should test this option first. I am still pleased with the responsiveness of my web host. If my problem is resolved, I will recommend them highly.

Thank you again for your kindness.
Old 10-16-2007, 01:25 PM   #16 (permalink)
Rulu
Hello Steven,

I answered your question about the numeral #160 in my previous post, but my answer was edited by the Forum. The HTML code for a blank space is either (&nbsp or (&#160:), where the colon should actually be a semicolon. The Forum replaced the first semicolon with a smiley but did not edit the second one. That's why you see ( ) in the first paragraph.
Old 10-16-2007, 06:56 PM   #17 (permalink)
Steven_D
Quote:
Originally Posted by Rulu
I have contacted a website building company associated with a member of this Forum. They said that they saw coding problems on my pages. It is going to be a $600 job to fix them. I have not responded to them.
Lol, it would cost less to have someone rebuild your site in PHP.

Don't use Notepad++; use TextPad.

Add me on MSN: steven@walkoffwithyourcompetitorsmoney.com

I can probably fix the problem for you based on the information provided above.
Old 10-18-2007, 04:34 PM   #18 (permalink)
Rulu
Hello all,

I did not think my problem was solvable, but it has been resolved in an unexpected way. Although the tech support of my web host misinformed me in July that the ‘yp’ code in front of my Doctype tag was not a problem and that search engines should be able to read my web pages, I am still pleased that they are now willing to reopen my case and make constructive suggestions.

After months of frustration, what I have learned is summarized below:
(1) My files, initially saved in the UNICODE format in regular Notepad on my PC, had turned into an Endian format (UCS-2 Little Endian) on their side. Their File Manager had inserted this ‘yp’ code in front of the Doctype tag, and search engines could not read beyond it.
(2) Files saved in the UTF-8 format in either regular Notepad or Notepad++ (recommended by one consultant) still provoke their File Manager to insert a code (not ‘yp’) in front of the Doctype tag.
(3) Files saved in the UTF-8 format in PSPad (recommended by another consultant) prevent their File Manager from inserting a code.
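
The different code seen in point (2) could be the UTF-8 byte-order mark, the three bytes EF BB BF, which some editors write at the start of a UTF-8 file and others leave out. That is a guess, since the thread never shows what the second code looked like. The Python sketch below shows those bytes and one way to drop them. The helper name `strip_utf8_bom` is invented for illustration.

```python
import codecs


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return raw without a leading UTF-8 byte-order mark, if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# "utf-8-sig" is Python's name for UTF-8 with a BOM in front.
with_bom = "<!DOCTYPE HTML>".encode("utf-8-sig")
print(with_bom[:3])              # b'\xef\xbb\xbf'
print(strip_utf8_bom(with_bom))  # b'<!DOCTYPE HTML>'
```

If that guess is right, the editor that worked in point (3) would simply be one that saved UTF-8 without the mark.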

Online HTML tutorials tell students to use regular Notepad and save files in the UNICODE format, but it has been phased out and replaced by an Endian format. If you look at the menu of a text editor such as Notepad++ or PSPad, you will find that its formats include UTF-8, two versions of Endian, etc., but not UNICODE. I am lucky that after three days of e-mail correspondence with tech support, when everyone was giving up, one of their consultants happened to mention PSPad, which can conquer their File Manager. My files on their system are finally free of any special code, and search engines will be able to read my web pages.

I appreciate the link to the Poodle Predictor that Steven posted here. It is very useful. Thank you again. I would highly recommend my web host, prestigePHP.com. They have helped me in a way my first web host would not.

Maybe you don’t have this kind of problem if you are using FrontPage or Dreamweaver, or if your File Manager is powerful enough to handle phased-out formats. My harrowing experience is reported above for your information.
Old 10-18-2007, 05:05 PM   #19 (permalink)
Steven_D
Glad you got it sorted, mate. I'm having a hard time understanding how using PSPad stopped the file manager from modifying the file, but the fact that it is working is all that matters.

Old 10-19-2007, 02:50 AM   #20 (permalink)
nic
Junior Member
 
Join Date: 10-19-07
Location: Australia
Posts: 28
iTrader: 0 / 0%
Latest Blog:
None

nic is liked by many
See the META tag for content in your pages,
META HTTP-EQUIV= "Content-Type" CONTENT= "text/html; CHARSET=UTF-8"

remove the
UTF-8
and replace it with
iso-8859-1

Also save the page as an ANSI text file with Notepad, not as UTF-8 and
not as any other charset format (see the encoding choice at the bottom of the Save As dialog).

That format and charset are usually used for XML directly, if anything.

Note: for almost all English-language text on the internet you are using
iso-8859-1 or -2 and standard ASCII text.
The numeric codes for characters differ between charsets.

Also,
a href =
should be
a href=
(no spaces at the equal signs)
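
One caveat before switching the charset: iso-8859-1 has room for only 256 characters, so it cannot directly hold the romanized Sanskrit on the home page or the Chinese characters on the other pages. The quick Python check below uses the mantra line from the posted source, plus a sample Chinese character that is not taken from the site. The helper name `fits_latin1` is made up for the example.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text can be encoded as iso-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_latin1("Buddha Sutras Mantras Sanskrit"))  # True
print(fits_latin1("oṁ mune mune mahāmunaye svāhā"))   # False
print(fits_latin1("佛"))                              # False
```

Pages with such text would need either UTF-8 saved without a byte-order mark or numeric character references for every non-Latin character.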