
Re: st: Insheeting Japanese

From	"Austin Nichols" <[email protected]>
To	[email protected]
Subject	Re: st: Insheeting Japanese
Date	Tue, 23 Sep 2008 14:35:05 -0400

Dan Weitzenfeld:
Stata's -file- command can deal with this file; see -help file- for
examples of writing a loop to process a file.  But converting in
another program, then using -infile- or -insheet-, is likely easier.
The best approach depends on how often you expect to face this
situation in the future.
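
For the "convert in another program" route, here is a minimal sketch in
Python. It assumes the file really is UTF-16 with a byte-order mark, that
fields are separated by an ordinary ASCII comma, and that the second field
is always numeric, as Dan describes. The function name and the file names
in the usage note are hypothetical.

```python
import csv


def utf16_csv_to_ascii_column(src_path, dst_path, column=1):
    """Copy one column of a UTF-16 comma-delimited file to an ASCII file.

    column is zero-based, so column=1 is the second field.
    Returns the number of rows written.
    """
    written = 0
    # encoding="utf-16" reads the byte-order mark to decide endianness.
    # newline="" leaves line-end handling to the csv module.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="ascii", newline="\n") as dst:
        for row in csv.reader(src):
            # Skip blank or short rows rather than guessing a value.
            if len(row) > column:
                # Writing as ASCII raises an error if the field is not
                # plain ASCII, which flags a wrong assumption early.
                dst.write(row[column].strip() + "\n")
                written += 1
    return written
```

Calling something like utf16_csv_to_ascii_column("survey.csv",
"survey_col2.txt") should leave a plain ASCII file with one value per
line, which -insheet- or -infile- ought to read without trouble.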

On Tue, Sep 23, 2008 at 2:28 PM, Steven Samuels
<[email protected]> wrote:
> Dan, I don't know if Stata can read Unicode.  The -help- for -insheet-
> states it is for ASCII text.  One possibility: use a text editor to add
> double quotes (") at the beginning and end of lines and on either side of
> the commas. This may cause everything to be read as character (string)
> data.  Then convert only the variable you want back to real.
>
> -Steve
>
> On Sep 23, 2008, at 2:19 PM, Dan Weitzenfeld wrote:
>
>> I've been informed that the files are written in Unicode (UTF-16).  Can
>> Stata read this?
>>
>> On Tue, Sep 23, 2008 at 11:08 AM, Dan Weitzenfeld
>> <[email protected]> wrote:
>>>
>>> Thanks Sergiy, I did not know about that command.  Below is a line
>>> from my hexdump:
>>>
>>>   130 | 304b ff1f 002c 0031 002c 0032 000d 000a | 0K...,.1.,.2....
>>>
>>> I also noticed this when I ran it with the -analyze- option:
>>>
>>>  Line-end characters
>>>   \r\n         (Windows)             0
>>>   \r by itself (Mac)                  5
>>>   \n by itself (Unix)                 5
>>>
>>> which looks suspicious to me.   I'll talk to the tech guys who made this
>>> file.
>>> Thanks again Sergiy.
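
The hexdump line quoted above can be decoded by hand, and doing so may
explain the odd line-end counts. The sketch below, in Python, assumes
big-endian UTF-16; the "þÿ" at the start of the -insheet- output is what
the bytes FE FF (the big-endian byte-order mark) look like when shown as
Latin-1, which supports that assumption. Every comma, digit, \r and \n is
stored as two bytes with a leading 00, so \r and \n are never adjacent and
a byte-level scan finds no \r\n pairs.

```python
# Bytes copied from the hexdump line quoted above (offset 130).
sample = bytes.fromhex("304b ff1f 002c 0031 002c 0032 000d 000a")

# Assumption: big-endian UTF-16, as the leading FE FF bytes suggest.
decoded = sample.decode("utf-16-be")
print(repr(decoded))  # 'か？,1,2\r\n'

# Byte-level line-end counts, the way a byte-oriented tool sees them.
crlf_pairs = sample.count(b"\r\n")  # 0: a 00 byte sits between 0d and 0a
cr_bytes = sample.count(b"\r")      # 1
lf_bytes = sample.count(b"\n")      # 1
print(crlf_pairs, cr_bytes, lf_bytes)  # 0 1 1

# The byte-order mark FE FF, displayed as Latin-1 text.
bom_as_latin1 = b"\xfe\xff".decode("latin-1")
print(bom_as_latin1)  # þÿ
```

So the file probably does use Windows line ends; they are just encoded as
two-byte characters, which a reader expecting ASCII will not recognize.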
>>>
>>>
>>>
>>> On Tue, Sep 23, 2008 at 10:51 AM, Sergiy Radyakin
>>> <[email protected]> wrote:
>>>>
>>>> Dear Dan,
>>>>
>>>> How data "looks" depends on which software "looks" at it. From
>>>> what I see in your message, there is a double-byte encoding of
>>>> letters, which may cause a problem.
>>>>
>>>> I suggest you first "look" at your data byte by byte to find the
>>>> pattern you need, then filter your data based on that pattern.
>>>> Use
>>>>  -hexdump- filename
>>>> to see how your data is structured. Check that you are using the
>>>> correct separator ("comma" and not "tab"), that the "comma" in your
>>>> file is indeed a standard ASCII "comma" and not some weird two-byte
>>>> comma, that the "comma" byte (44) is not used for encoding other
>>>> characters, etc.
>>>>
>>>> Perhaps you could post a portion of output from hexdump here if this
>>>> does not contradict any rules of the list.
>>>>
>>>> Regards, Sergiy Radyakin
>>>>
>>>>
>>>> On Tue, Sep 23, 2008 at 1:09 PM, Dan Weitzenfeld
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi All,
>>>>> Quick but strange question.  I'm trying to insheet a comma-delimited
>>>>> file with Japanese in it.  For example, the first line looks like:
>>>>>
>>>>> あなたはこのＣＭが好きですか？,0,とても好き
>>>>>
>>>>> The only information I need is the second variable, the 0, which will
>>>>> always be numeric.
>>>>>
>>>>> However, when I insheet the file, I get nonsense:
>>>>>
>>>>> þÿ0B0j0_0o0S0nÿ#ÿ-0LY}0M0g0Y0Kÿ                 0h0f0‚Y}0M
>>>>>
>>>>> which would be okay, except that the second variable always comes in as
>>>>> blank.
>>>>>
>>>>> Does anyone know of a solution for this?
>>>>>
>>>>> Thanks in advance,
>>>>> Dan
>>>>>
>>>>> *
>>>>> *   For searches and help try:
>>>>> *   http://www.stata.com/help.cgi?search
>>>>> *   http://www.stata.com/support/statalist/faq
>>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>>

