Archive

Posts Tagged ‘web analytics’

Gotchas of Foreign-Language SEO

For some time, one of my passions has been figuring out the details of foreign-language and foreign-character-set SEO.  How do you do Search Engine Optimization for foreign character sets, and specifically for languages that do not use the Roman alphabet but instead use Cyrillic, Japanese, Chinese or Greek characters?

SEO is getting to be more and more a normal thing to do, and less and less of a hidden black art.  Google has made it plain enough times that what they want is good, fresh, updated, relevant content,  and not a bunch of garbage.

Pursuant to that, you’ve got a ton of fairly well-documented best practices for SEO’ing your site.  And if you don’t know the first thing about SEO at all, well, read a good book on the subject.  My favourites are:

Or you can just hit SEOMoz or SEOBook for some hot tips.

But one unfortunate thing is that most of the best SEO data is coming from people who are ignorant Americans like me.  Despite my love of geography and far-off places, I can speak no foreign languages fluently, except for some Korean bad words I learned from fellow soccer players.

What does that have to do with anything?

Take the preceding picture I just linked to where I’m doing a soccer throw-in.

Assuming you could edit that page, if you ask any search engine novice to optimize that page to show up well for its subject matter, they’d probably tell you to hit the easy things first.  They’d tell you to optimize:

- HTML <title> tag
- <meta name="description"> tag
- <meta name="keywords"> tag
- <H1> text
- Body text
- text of inbound links
- filename of the page
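
To make that checklist concrete, here is a minimal sketch in Python that pulls the first few items out of a page so you can eyeball them. It uses only the standard library's `html.parser`. The `OnPageAudit` class, the `audit` helper and the sample page are made up for illustration, and inbound link text and filenames are not something a single page can tell you, so they are left out.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collects the title, meta description, meta keywords and h1 text."""

    def __init__(self):
        super().__init__()
        self.found = {"title": "", "description": "", "keywords": "", "h1": ""}
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "meta" and attrs.get("name") in ("description", "keywords"):
            self.found[attrs["name"]] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.found[self._current] += data


def audit(html):
    """Return a dict of the on-page elements found in an HTML string."""
    parser = OnPageAudit()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical page, loosely modelled on the throw-in example.
sample = """<html><head><title>Soccer Throw-In</title>
<meta name="description" content="Photo of a soccer throw-in">
</head><body><h1>Soccer Throw-In</h1></body></html>"""

report = audit(sample)
for name, value in report.items():
    print(name, "->", value or "(missing)")
```

An empty value, like the keywords entry for the sample page, is the cue that an item on the list still needs attention.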

Ideally your page would have “Soccer Throw-In” or a more unique title and <h1> text, and would have a description and set of meta keywords that followed along.  Ideally, as well, you’d have a filename like “/soccer-throw-in.html” or similar.

Easy, right?  Of course it is — in English.

But, let’s say you have similar items in German, or worse, Japanese, Greek and Russian!

As an example, the Japanese word for “soccer” is “サッカー”.  What do you make as the page title for that?  The filename?

If you do a google.co.jp search for “サッカー”, one of the first results you get is a Wikipedia article for “サッカー” which has a displayed URL of:

http://ja.wikipedia.org/wiki/サッカー

Now, of course, anyone with any technical sense will tell you that you can’t put non-7-bit-ASCII characters into the URL of an HTTP request, as that violates the spec.

But of course, when you paste such a URL into your browser, it gets automatically percent-encoded to:

http://ja.wikipedia.org/wiki/%E3%82%B5%E3%83%83%E3%82%AB%E3%83%BC
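
The two forms of that address are related by ordinary percent-encoding: the term is turned into UTF-8 bytes, and each byte is written as `%XX`. A small sketch with Python's standard `urllib.parse` shows the round trip. The variable names are just for illustration.

```python
from urllib.parse import quote, unquote

term = "サッカー"  # Japanese for "soccer"

# quote() encodes the text as UTF-8 and writes each byte as %XX.
encoded = quote(term)
url = "http://ja.wikipedia.org/wiki/" + encoded

# unquote() reverses the process and recovers the readable form.
decoded = unquote(encoded)

print(url)
print(decoded)
```

So the filename on the wire is pure ASCII, while the search engine and the browser are free to show the readable Japanese version.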

So, it has the benefit of (a) showing up with the proper Japanese term in the search engine result page, improving the apparent relevance of the result, and (b), well, showing up in the top 10 listings at all, so you’d think it has SOME positive impact on ranking.

European terms are much easier, as there are common transliterations for many of the non-7-bit-ASCII characters that one would use in normal usage.

For example, Google for the beautiful German city of Düsseldorf.  Clearly, one wouldn’t want to have to title all one’s pages as “Dusseldorf”, as that would read as “village of idiots”, as opposed to Düsseldorf, which is named for the Düssel, a small tributary of the River Rhine.  The u-umlaut is generally transliterated to “ue”, so by Googling for “Duesseldorf” you get an acceptable result, as Google knows what you’re talking about.
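
For German, that transliteration is mechanical enough to automate when generating filenames. Here is a rough sketch in Python. The `GERMAN_MAP` table and the `slugify_german` helper are hypothetical names, and the mapping covers only the usual umlaut and ß conventions, not every accented character you might meet.

```python
import re
import unicodedata

# Conventional ASCII spellings for German letters (assumed for illustration).
GERMAN_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def slugify_german(title):
    """Turn a German title into a lowercase, hyphenated ASCII filename stem."""
    # Normalize first so "u" + combining diaeresis becomes a single "ü".
    text = unicodedata.normalize("NFC", title)
    text = "".join(GERMAN_MAP.get(ch, ch) for ch in text)
    # Anything that is still not a-z or 0-9 becomes a hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return text.strip("-")


print(slugify_german("Düsseldorf"))
print(slugify_german("Fußball in Köln"))
```

Whether the transliterated filename or the percent-encoded original ranks better is exactly the open question here. The sketch only shows that the European case is easy to handle consistently.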

Not so easy with these other languages like Greek, Hebrew, Hindi, etc.

I’m very interested in any input or feedback on this, as it’s a massive gray area right now, and I don’t know if ANYONE has this one covered well.


Is Log Analysis for Web Analytics a Dead Subject?

web analytics presentation of Webtrends

Web Analytics research project in Atlanta

I’ve been involved in web analytics in one capacity or another for about 11 years.  Back in 1998, when I was first getting started on this when working for Webtrends, there were only two ways to go about getting stats from your website:  either get a program to crunch the logs for your site (of which there were many), or pay some ridiculous sum for a tool like Aria or NetGenesis or HitList which used packet-sniffers placed in front of your web server, to track various interactions with your site.

In either case, the entire subject of web analytics assumed that every interaction that your users were doing with your site was going to result in another page request back to the server, which you could then track via a log file.
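
For anyone who never had to crunch logs, this is roughly what the approach boils down to. The sketch below is in Python and assumes the server writes Common Log Format. The sample lines, the `LOG_LINE` pattern and the `count_page_views` helper are invented for illustration and are not taken from any of the products mentioned here.

```python
import re
from collections import Counter

# One Common Log Format line: host, identity, user, [time], "request", status, size.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_page_views(lines):
    """Count successful GET requests per path."""
    views = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("method") == "GET" and match.group("status") == "200":
            views[match.group("path")] += 1
    return views


# Hypothetical log entries.
sample_log = [
    '10.0.0.1 - - [10/Oct/2009:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2009:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.1 - - [10/Oct/2009:13:57:12 -0700] "GET /video.html HTTP/1.0" 200 5120',
    '10.0.0.1 - - [10/Oct/2009:13:58:40 -0700] "GET /missing.html HTTP/1.0" 404 210',
]

views = count_page_views(sample_log)
for path, hits in views.most_common():
    print(path, hits)
```

Note that the only thing this can ever see is a request that reached the server. Anything the visitor does inside a page without triggering a new request never shows up.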

Wow, how times have changed.  Try going to a site like the main video channel for the Church of Scientology, or this one for the Volunteer Ministers.  You can complete an hour-long stay at either website, and still have only looked at one HTML page.  Doesn’t make log-based analytics too entertaining, especially when your videos are hosted off-site.

In late 2004, I did a project for a company in Atlanta, testing out literally about 30 different web analytics solutions to work out the best one for them.  At that point in time, out of all the different packages that I reviewed, there was a pretty even split: about half of the web analytics companies were making a go of it with a shrink-wrap, software-based solution hosted at the client side, and the other half were ASPs.

Many organizations that I was working with were not too interested in turning to an ASP-based setup, for various security reasons as well as the fact that ASPs aren’t too great at doing analytics for intranet sites, where the client can’t access the external internet.

At that point in time, the landscape looked something like this:

| Analytics Product | ASP or Software | Analysis Type |
|---|---|---|
| Webtrends | Either | Log Analysis or Page-tagging |
| Datanautics G2 | Software | Log Analysis & Packet Sniffer |
| DeepMetrix LiveStats xSP | Software | Log Analysis |
| Pilot HitList | Software | Log Analysis & Packet Sniffer |
| Sane NetTracker | Software | Log Analysis or Page-tagging |
| Sawmill | Software | Log Analysis |
| SPSS NetGenesis | Software | Log Analysis, Page-tagging and Packet Sniffer |
| Urchin | Software or ASP | Log analysis or Page-tagging |
| Eloqua | ASP | Page-tagging |
| Elytics EAS | ASP | Page-tagging |
| Manticore Virtual Touchstone | ASP | Page-tagging |
| Omniture SiteCatalyst | ASP | Page-tagging, but they’d import your old logs for a fee |
| SageMetrics SageAnalyst | ASP | log analysis and page-tagging |
| WebSideStory HBX | ASP | Page-tagging, but they’d import your old logs for a fee |

As you can see, it was about half-and-half, with most products still clinging to log analysis, but many more progressive (and sometimes completely frightening) products going to page-tagging exclusively to be able to trap and coordinate interactions with your sites.
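
If page-tagging is new to you, the mechanism is simple: a snippet on the page requests a tiny image from a collection server, and everything worth recording rides along in the query string of that request. Real tags are JavaScript running in the browser. The sketch below uses Python only to show both ends of the exchange, and the collector address, parameter names and helper functions are all hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

COLLECTOR = "http://stats.example.com/pixel.gif"  # hypothetical endpoint


def build_beacon(page, event, visitor_id):
    """What the page tag does: pack the interaction into a pixel request URL."""
    params = {"page": page, "event": event, "vid": visitor_id}
    return COLLECTOR + "?" + urlencode(params)


def read_beacon(url):
    """What the collector does: unpack the query string into one record."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


# A video play on a single page, with no new page request involved.
beacon = build_beacon("/videos", "play:intro", "abc123")
hit = read_beacon(beacon)

print(beacon)
print(hit)
```

Because the tag decides when to fire, it can report events such as a video play that never produce a page request of their own.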

But now, boy has the landscape changed.   After Google bought Urchin and transformed it into a free product, countless companies have now successfully experimented with Google Analytics and found it (and with it, the whole premise of tagging pages) to be a reliable and insightful way of tracking interactions with pages. 

Also, the fact that it is free has forced a lot of companies to either (a) drop the web analytics business altogether, or (b) dramatically change their model so as to differentiate themselves from Google Analytics and move themselves way upmarket.

Check out how this grid looks now, 5 years later:

| Analytics Product | Still around?  If so, what type of product | Analysis Type |
|---|---|---|
| Webtrends | ASP or Software | Log Analysis or Page-tagging |
| Datanautics G2 | Product is discontinued | |
| DeepMetrix LiveStats xSP | Bought by Microsoft, and then deep-sixed | |
| Pilot HitList | Product bought by SAP and then deep-sixed | |
| Sane NetTracker | Acquired by Unica, now merged in with their Net Insight software product. | Log Analysis or Page-tagging |
| Sawmill | Still Software | Log Analysis |
| SPSS NetGenesis | Toasted by SPSS?  Product page is now a 404 | |
| Urchin | Bought by Google, is now Google Analytics | Page Tagging |
| Eloqua | ASP | Page-tagging |
| Elytics EAS | Dead like a doornail | |
| Manticore Virtual Touchstone | ASP | Page-tagging |
| Omniture SiteCatalyst | ASP | Page-tagging, but they’d import your old logs for a fee |
| SageMetrics SageAnalyst | ASP | log analysis and page-tagging |
| WebSideStory HBX | Bought by Omniture, old HBX product is toast | |

In any case, it’s a subject for another blog post as to what companies have to do to differentiate themselves from Google Analytics in order to make it worth the cash for users to upgrade from a free product.

The main question here, though, is whether log analysis still has any value or relevance in the market.  What do you get from a log analysis tool these days that you can’t get from a pixel tracker?

Author: TurboDad
Categories: web analytics