Hi all,

I'm not sure if this is a bug.

If I download this file [1], extract it, and run:

grep "<title>" wikiindexorg-20110409-history.xml | sort | uniq -D

It reports the following lines as duplicates:

    <title>Felix Pleşoianu Wiki</title>
    <title>Felix Pleșoianu Wiki</title>
    <title>ᐧᐃᑭᐱᑎᔭ</title>
    <title>위키낱말사전</title>
    <title>ウィクショナリー</title>
    <title>언사이클로피디어</title>
    <title>ไทย Wikipedia</title>
    <title>한국어 Wikipedia</title>

But these are obviously all different lines, so why does uniq -D report them as duplicates?
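
One check I can think of is to repeat the comparison byte by byte. This is only a sketch, and it assumes (a guess on my part, not something I have confirmed) that the locale's collation rules are what make sort and uniq treat these lines as equal. The function name dup_titles_bytewise is made up for illustration; setting LC_ALL=C is the standard way to ask both tools for plain byte comparison:

```shell
# Hypothetical helper: print every <title> line that occurs more than once
# in the file given as $1, comparing lines byte by byte.
# LC_ALL=C turns off locale-specific collation for both sort and uniq.
# Note: -D (print all duplicated lines) is a GNU uniq option.
dup_titles_bytewise() {
    grep "<title>" "$1" | LC_ALL=C sort | LC_ALL=C uniq -D
}
```

Calling it as dup_titles_bytewise wikiindexorg-20110409-history.xml should then list only titles that are identical at the byte level. If the lines above no longer show up, the difference would seem to come from the locale rather than from the data.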

Thanks,
emijrp

[1] http://code.google.com/p/wikiteam/downloads/detail?name=wikiindexorg-20110409-history.xml.7z