FlyerTalk Forums


stimpy Jun 22, 2004 2:35 pm

We're in Google!
 
I'm not sure if this is the right spot, but I was shocked today to see a random recent post of mine show up in a Google search. Is someone on Flyertalk actively adding all our threads to Google?

ScottC Jun 22, 2004 2:42 pm


Originally Posted by stimpy
I'm not sure if this is the right spot, but I was shocked today to see a random recent post of mine show up in a Google search. Is someone on Flyertalk actively adding all our threads to Google?

Google "spiders" pretty much every website; with vBulletin it's probably grabbing much more than it used to on UBB.

Aviatrix Jun 30, 2004 3:16 pm


Originally Posted by ScottC
Google "spiders" pretty much every website; with vBulletin it's probably grabbing much more than it used to on UBB.

I'm not an expert, but I know there is something you can do to stop web sites from being indexed by search engines - you just need to put in the right bit of code.

DaveSF Jul 1, 2004 3:41 am

Easy enough to do
 
There's a standard method of excluding search engines from all or part of websites through the use of a file named "robots.txt." In FT's case, I think this might be a good idea, at least for certain parts of the site (e.g., CC) that might attract the "wrong sort" of people/attention from Google surfers...

See http://www.google.com/webmasters/3.html#B3


I don't want Google to crawl part or all of my site.

There is a standard method involving a "robots.txt" file for excluding robot crawlers. This will prevent Googlebot or other crawlers from visiting your site. Googlebot has a user-agent of "Googlebot". In addition, Googlebot understands some extensions to the robots.txt standard: Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate that the $ must match the end of a name. For example, to prevent Googlebot from crawling files that end in gif, you may use the following robots.txt entry:

User-agent: Googlebot
Disallow: /*.gif$

There is another standard for telling robots not to index a particular web page or follow links on it, which may be more helpful, since it can be used on a page-by-page basis. This method involves placing a "META" element into a page of HTML.

Remember, changing your server's robots.txt file or changing the "META" elements on its pages will not cause an immediate change in what results Google returns. It is likely that it will take a while for any changes you make to propagate to Google's next index of the web.
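To make the quoted * and $ extensions a bit more concrete, here is a minimal Python sketch of how such a Disallow pattern could be matched against a URL path. The helper names (pattern_to_regex, is_disallowed) are made up for illustration, and this is only one reading of the description quoted above, not Googlebot's actual matching code.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a Disallow pattern using * and a trailing $ into a regex.

    Hypothetical helper: * matches any sequence of characters, and a
    trailing $ anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for every *.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, pattern: str) -> bool:
    """Return True if the path falls under the given Disallow pattern."""
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    print(is_disallowed("/images/logo.gif", "/*.gif$"))   # True
    print(is_disallowed("/forum/index.php", "/*.gif$"))   # False
```

Without the trailing $, the same pattern would also catch paths that merely contain ".gif" somewhere after the leading slash, such as "/images/logo.gif?size=large".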

alanw Jul 1, 2004 8:17 am

Why would FT want to do that?

Kinda defeats the point, doesn't it?

ozstamps Jul 3, 2004 1:26 am

Seems to me that our posts were always spidered by Google?

Type any user name into Google and you'll find matches even from the old FT AFAIK.

ScottC Jul 7, 2004 6:35 am


Originally Posted by ozstamps
Seems to me that our posts were always spidered by Google?

Type any user name into Google and you'll find matches even from the old FT AFAIK.

Will you find 20949 matches? Google indexed little portions, and at one point completely stopped indexing; most Google entries on the old board were outdated threads. The purpose of this thread was to mention that Google is indexing more recent stuff.

wharvey Jul 7, 2004 10:39 am

To get this thread to relevance for this forum:

And just what is the TALKBOARD doing about this Google revelation? :)

William

stimpy Jul 7, 2004 10:50 am

One thing I think Flyertalk can do is tell everyone (perhaps via Talkmail?) what is going on. I'm sure many of us don't realize that our posts are being exposed in such a way to the outside world. Yes, anyone can read Flyertalk, but that takes effort. As we know, Google allows people to easily grab the most arcane and buried information. Just a warning to think about Google before you post certain info might be good.

flamboyant 1 Jul 7, 2004 1:21 pm

I got to FT when searching Google for a hotel name. I forget which property, but it brought FT to my attention. I am kind of happy and mad about it... all those hours lurking and posting, but I enjoy doing so on the other hand...


^ :) ^

myefre Jul 7, 2004 11:15 pm

I just typed in my user name and got 156 matches, and they were all from FT.

percussionking Jul 9, 2004 9:26 am

TechTV had an article on this, but they're going through some major changes so I hesitate to post a link; it may soon be invalid.

You have to put robots.txt on each server at the root directory for the website. The one for my webpage looks something like this:

---Start of file---
# robots.txt for http://www.mywebpage.com/
# Disallow: /folder_name/

User-agent: *
Disallow: /
---End of file---

This will prevent all search engines using this method from looking in any of my folders. The first two lines are completely unnecessary; they are just instructions for the person writing the file.
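For anyone who wants to check what a well-behaved crawler would make of that file, here is a small sketch using Python's standard urllib.robotparser module. The file contents are the ones posted above; the helper name "allowed" and the bot name "SomeOtherBot" are just placeholders for illustration. Keep in mind this only models crawlers that honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the post above, comment lines included.
ROBOTS_TXT = """\
# robots.txt for http://www.mywebpage.com/
# Disallow: /folder_name/

User-agent: *
Disallow: /
"""


def allowed(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; '#' comments are ignored.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "http://www.mywebpage.com/folder_name/page.html"
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, allowed(agent, url))  # both print False
```

Since the only rule is "Disallow: /" under "User-agent: *", every path on the site comes back as off limits for every crawler name, and the two comment lines make no difference to the result.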

Spiff Jul 9, 2004 5:30 pm


Originally Posted by wharvey
To get this thread to relevance for this forum:

And just what is the TALKBOARD doing about this Google revelation? :)

William

I'm taking action!

Off to Technical Forum with this topic! :)

jcrb Jul 9, 2004 8:11 pm

I mentioned this to Randy a while ago: we didn't used to appear in the search engines, but after the switchover it appears the default setting went from no robots to letting them crawl the site. I think I would feel a lot better if at a minimum they didn't index OMNI :D and probably a number of other boards, such as the rental car forums that tend to be all about codes etc., and trip reports and itineraries, where people tend to put out information that they don't really want to share with the whole wide world.

Maybe it's just me, but I think it would be a good idea if most of the site did not appear in the search engines.
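As a rough illustration of hiding only some boards rather than the whole site, the Python sketch below builds a robots.txt with one Disallow line per section and then checks it with the standard urllib.robotparser module. The section paths and the example.com host are invented for the example; they are not FT's real URLs, and the helper names are hypothetical too.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical section paths, invented for illustration only.
PRIVATE_SECTIONS = ["/forum/omni/", "/forum/rental-cars/"]


def build_robots_txt(sections):
    """Build a robots.txt that asks all crawlers to skip the given sections."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in sections]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(PRIVATE_SECTIONS)
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def crawlable(path, agent="Googlebot"):
    """Return True if a compliant crawler may fetch the given path."""
    # example.com is a placeholder host, not the real site.
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    print(robots_txt)
    print(crawlable("/forum/omni/some-thread.html"))           # False
    print(crawlable("/forum/technical-support-feedback/"))     # True
```

Anything not listed stays crawlable, so the rest of the site would keep showing up in searches while the listed boards are skipped by crawlers that respect the file.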

flamboyant 1 Jul 10, 2004 2:07 am

Originally Posted by jcrb
Maybe it's just me, but I think it would be a good idea if most of the site did not appear in the search engines.


Exactly my thought. There is a lot of personal information disclosed here that we do share with our FT friends but prefer not to share with the airlines, rental car companies, etc.

-Links to Targeted Promotions,
-CDWs and AWDs and other Discount codes
-Information about flights taken and description of service attendants
-Information about personal habits (e.g. in OMNI!)
-Links to photos of my family as part of trip reports
-Photos of many FTers can be accessed through this site as well as personal websites and pages with the real names and not just the FT handle

Please make sure that the items mentioned above cannot be found in searches and matched to the e-mail address given for FT registration purposes.


All times are GMT -6.

