read in data (txt file) with special time format

Here is a solution to the issue.

I recommend saving this as a procedure file in the "Igor Procedures" folder so that it will be automatically loaded when you launch Igor. Execute this for details:
DisplayHelpTopic "Special Folders"

If this does not work with your file then it is probably due to formatting subtleties. In that case, attach a zip archive of your file so I can see exactly what the file contains.

// This loads a file containing data like this:
//  dd.mm.yyyy<space>hh.mm.ss.ff<space> ...
// where hh.mm.ss.ff is a time of day with fractional seconds. The procedure changes the
// dots in the time to colons so that it works with LoadWave/J, writes a temporary file containing
// the massaged text, and loads the data from the temporary file.
// There is a DeleteFile call at the end of the procedure to delete the temporary file. If you want to
// inspect the temporary file, comment out the DeleteFile call.
// In addition to the issue of the formatting of the time, the file has another peculiarity as posted
// at http://www.igorexchange.com/node/7831. It looks like this:
// Datum Zeit Tmp_ak WGms WR DatumZeit
// 17.09.2016 09.50.00.02 15.8 1.4 248 17.09.2016 09.50.00.02
// The column names for the first date/time have a space between them but the column name for the last
// date/time has no space. To sidestep this issue, I am ignoring the column names altogether and
// using LoadWave /B to set the column names.
// There is another problem. The use of space to separate the date and time as well as to separate one column
// from the next creates problems with LoadWave. Therefore the procedure replaces the space used to separate
// one column from the next with a tab.

Menu "Load Waves"
    "Load Kerstin File...", LoadKerstinFile("", "")
End

static Function/S FixText(textIn)
    String textIn

    String textOut = ""
    Variable numBytes = strlen(textIn)
    Variable bytesLeft = numBytes
    Variable offset = 0
    for(offset=0; offset<numBytes; )
        String ch = textIn[offset]
        Variable isSpace = CmpStr(ch," ")==0
        if (isSpace && bytesLeft>=12)               // Space and sufficient remaining bytes to be hh.mm.ss.ff?
            String section = textIn[offset+1,offset+11]
            // The backslashes are doubled so that the regular expression engine sees \. (a literal dot)
            String regExp = "[[:digit:]]{2}\\.[[:digit:]]{2}\\.[[:digit:]]{2}\\.[[:digit:]]{2}"    // hh.mm.ss.ff
            if (GrepString(section,regExp))
                section = ReplaceString(".", section, ":")
                textOut += ch + section
                Variable sectionLength = strlen(section)
                offset += 1 + sectionLength                 // 1 for ch (space)
                bytesLeft -= 1 + sectionLength
                continue
            endif
        endif
        if (isSpace)
            ch = "\t"                               // Replace space separating columns with tab
        endif
        textOut += ch
        offset += 1
        bytesLeft -= 1
    endfor
    return textOut
End
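
A quick way to see what FixText does to one line of the file is a small test function. This is only a sketch: DemoFixText is a hypothetical name, and because FixText is declared static the function has to live in the same procedure file.

```igor
// Hypothetical test helper; must be in the same procedure file as FixText.
Function DemoFixText()
    // Sample line taken from the file excerpt posted in this thread
    String line = "17.09.2016 09.50.00.02 15.8 1.4 248 17.09.2016 09.50.00.02"
    String fixed = FixText(line)
    // Make the inserted tabs visible in the history area
    Print ReplaceString("\t", fixed, "<tab>")
End
```

If the conversion works as described in the header comments, the history area should show 17.09.2016 09:50:00:02<tab>15.8<tab>1.4<tab>248<tab>17.09.2016 09:50:00:02, that is, the space between each date and time is kept and every other space becomes a tab.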

//  LoadKerstinFile(pathName, fileName)
//  The data file uses dots in the time of day and spaces between columns, which LoadWave cannot handle.
//  This routine creates a temporary version of the file with the text fixed by FixText
//  and loads data from the temporary file.

Function LoadKerstinFile(pathName, fileName)
    String pathName     // Name of an Igor symbolic path or "".
    String fileName     // Name of file or full path to file.

    Variable refNum

    // First get a valid reference to a file.
    if ((strlen(pathName)==0) || (strlen(fileName)==0))
        // Display dialog looking for file.
        // Replace /T="????" with, for example, /T=".dat" if your files are .dat files.
        Open /D /R /P=$pathName /T="????" refNum as fileName
        fileName = S_fileName           // S_fileName is set by Open/D
        if (strlen(fileName) == 0)      // User cancelled?
            return -1
        endif
    endif

    // Open source file and read the raw text from it into a string variable
    Open/Z=1/R/P=$pathName refNum as fileName
    if (V_flag != 0)
        return -1                       // Error of some kind
    endif
    FStatus refNum                      // Sets V_logEOF
    Variable numBytesInFile = V_logEOF
    String text = PadString("", numBytesInFile, 0x20)
    FBinRead refNum, text               // Read entire file into variable.
    Close refNum

    // Fix the text
    text = FixText(text)                // Change time dots to colons and column-separating spaces to tabs

    // Write the fixed text to a temporary file.
    // Use of .noindex prevents Spotlight from indexing the file. Otherwise we get an error
    // when we try to delete the file because Spotlight has it open.
    String tempFileName = fileName + ".noindex"
    Open/P=$pathName refNum as tempFileName     // /P so that a partial fileName resolves like it does for LoadWave and DeleteFile
    FBinWrite refNum, text
    Close refNum

    String columnInfoStr = ""
    columnInfoStr += "N=DatumZeitA,F=8;"        // <date><space><time>
    columnInfoStr += "N=TmpAk,F=0;"             // <number>
    columnInfoStr += "N=WGms,F=0;"              // <number>
    columnInfoStr += "N=WR,F=0;"                // <number>
    columnInfoStr += "N=DatumZeitB,F=8;"        // <date><space><time>

    // Load the temporary file
    // The /L flag causes LoadWave to load the data starting from line 1 (zero-based), skipping the name line which is problematic.
    // The /B flag specifies the column names and formats.
    // The /R flag specifies the date format as dd.mm.yyyy.
    LoadWave/J/D/P=$pathName/E=1/L={0,1,0,0,0}/B=columnInfoStr/K=0/R={English,2,2,2,2,"DayOfMonth.Month.Year",40} tempFileName
    if (V_flag == 0)
        Printf "An error occurred while loading data from \"%s\"\r", S_fileName
    else
        Printf "Loaded data from \"%s\"\r", S_fileName

        // Make table bigger
        MoveWindow 10, 50, 800, 400

        // Set date format to show fractional seconds
        ModifyTable showFracSeconds[1]=1,digits[1]=2, width[1]=175
        ModifyTable showFracSeconds[5]=1,digits[5]=2, width[5]=175
    endif

    // Delete the temporary file
    DeleteFile /P=$pathName tempFileName

    return 0
End
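
Once the procedure is compiled, the loader can be used from the Load Waves menu or called from your own code. The following is a rough usage sketch, not part of the procedure itself: LoadAndPlotKerstinFile is a hypothetical name, and it assumes that the waves end up in the current data folder under the names given in columnInfoStr.

```igor
// Hypothetical usage example: load a file via the dialog and plot temperature against time.
Function LoadAndPlotKerstinFile()
    // Empty strings make LoadKerstinFile display an Open File dialog
    if (LoadKerstinFile("", "") != 0)
        return -1                       // User cancelled or the file could not be opened
    endif

    // Wave names are assumed to be those set in columnInfoStr
    Wave/Z DatumZeitA, TmpAk
    if (!WaveExists(DatumZeitA) || !WaveExists(TmpAk))
        return -1                       // Nothing was loaded under the expected names
    endif

    Display TmpAk vs DatumZeitA
    Label left "Tmp_ak"
    Label bottom "Date and time"
    return 0
End
```

The WaveExists check is there because LoadKerstinFile returns 0 even when LoadWave reports an error.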

July 10, 2017 at 10:02 am

zek

Thanks for your help!
The procedure works :-).
It is not very fast, probably because of the big data file.
I have attached one of my data files.

Best regards
Kerstin

Attachments WindTestFile.zip

July 14, 2017 at 02:32 am

_sk

zek wrote:
Dear all,
I try to read in a file like this

Datum Zeit Tmp_ak WGms WR DatumZeit
17.09.2016 09.50.00.02 15.8 1.4 248 17.09.2016 09.50.00.02
17.09.2016 09.50.00.12 15.7 1.3 253.6 17.09.2016 09.50.00.12
...

Why do you have duplicate columns in the data file? (Datum, Zeit) at the beginning and (Datum, Zeit) again at the end: each line begins and ends with this tuple. For the file that you posted, removing the last two columns gives a substantial reduction, from 70173015 bytes to 42524260 bytes (about 67 MB to about 41 MB).

Under a linux or cygwin environment you'd do something like:
cat wind_test_file.txt | awk '{print $1,$2,$3,$4,$5}' > wind_test_file_out
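
If the aim is only to avoid loading the duplicate column, rather than to shrink the file on disk, this could probably also be done inside Igor. A minimal sketch, assuming the column layout used by the procedure posted above: ColumnInfoNoDuplicate is a hypothetical name, and it relies on the LoadWave /B column info accepting the special name '_skip_' for a column that should not be loaded.

```igor
// Hypothetical helper: column info for LoadWave /B that skips the trailing date/time column.
Function/S ColumnInfoNoDuplicate()
    String columnInfoStr = ""
    columnInfoStr += "N=DatumZeit,F=8;"     // <date><space><time>
    columnInfoStr += "N=TmpAk,F=0;"         // <number>
    columnInfoStr += "N=WGms,F=0;"          // <number>
    columnInfoStr += "N=WR,F=0;"            // <number>
    columnInfoStr += "N='_skip_';"          // Do not create a wave for the duplicate date/time
    return columnInfoStr
End
```

The returned string would replace columnInfoStr in the LoadWave call. The ModifyTable line for column 5 would then have to go as well, since that column no longer exists in the table.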

Another recommendation is to change the delimiter to a comma. Then use the Data > Load Waves > Load delimited text... menu to load the data, keeping the time as text. Parse it after loading.

Under linux or cygwin in vim you'd issue the following to replace every space with a comma:
:1,$s/ /,/g
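
For the "parse it after loading" step, something along these lines might do in Igor. It is only a sketch: both function names are hypothetical, the time is assumed to have been loaded into a text wave in its original hh.mm.ss.ff form, and ff is assumed to be hundredths of a second.

```igor
// Hypothetical helper: convert "hh.mm.ss.ff" to seconds since midnight.
// Assumes ff is hundredths of a second. Returns NaN if the text does not parse.
Function ParseTimeOfDay(timeStr)
    String timeStr                      // e.g. "09.50.00.02"

    Variable hh, mm, ss, ff
    sscanf timeStr, "%d.%d.%d.%d", hh, mm, ss, ff
    if (V_flag != 4)                    // sscanf sets V_flag to the number of values read
        return NaN
    endif
    return 3600*hh + 60*mm + ss + ff/100
End

// Hypothetical helper: apply ParseTimeOfDay to every point of a text wave.
Function TimeTextToSeconds(timeText, outName)
    Wave/T timeText                     // Text wave holding the time strings
    String outName                      // Name for the numeric output wave

    Make/O/D/N=(numpnts(timeText)) $outName
    Wave out = $outName
    out = ParseTimeOfDay(timeText[p])
End
```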

Another recommendation is to combine the date and time into one variable, such as a Julian date or Unix time. This will provide further savings on the file size, but will increase processing upon file loading.
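
On the Igor side a combined value is the native representation anyway: Igor date/time values are seconds since 1904-01-01. A small sketch of the conversion, with hypothetical function names, ff assumed to be hundredths of a second, and time zones ignored:

```igor
// Hypothetical helper: convert "dd.mm.yyyy hh.mm.ss.ff" to Igor date/time seconds.
// Returns NaN if the text does not parse.
Function DateTimeTextToIgorSecs(str)
    String str                          // e.g. "17.09.2016 09.50.00.02"

    Variable dayNum, monthNum, yearNum, hh, mm, ss, ff
    sscanf str, "%d.%d.%d %d.%d.%d.%d", dayNum, monthNum, yearNum, hh, mm, ss, ff
    if (V_flag != 7)                    // sscanf sets V_flag to the number of values read
        return NaN
    endif
    return date2secs(yearNum, monthNum, dayNum) + 3600*hh + 60*mm + ss + ff/100
End

// Igor counts seconds from 1904-01-01, Unix time counts from 1970-01-01.
Function IgorSecsToUnixSecs(igorSecs)
    Variable igorSecs

    return igorSecs - date2secs(1970, 1, 1)
End
```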

edit: As a matter of fact, you don't need the date. It's a waste of bytes. Tell the person generating these files to put the corresponding date in the file name, or put it at the beginning of each file as a header or something, and to dump only the data for that particular day into that particular file. I would also use fractional time, even though a delta t would be enough (which, by the way, is not precisely 100 ms but a tiny bit less, ~99.99(93) ms).
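
To check the delta t on data already loaded into Igor, a few lines like these should be enough. This is a sketch with a hypothetical function name, and it assumes a date/time wave such as the DatumZeitA wave created by the procedure posted above.

```igor
// Hypothetical helper: print and return the mean interval between samples in seconds.
Function ReportSampleInterval(dt)
    Wave dt                             // Date/time wave, e.g. DatumZeitA

    Variable n = numpnts(dt)
    if (n < 2)
        return NaN                      // Need at least two points for an interval
    endif
    Variable meanDelta = (dt[n-1] - dt[0]) / (n - 1)
    Printf "Mean sample interval: %.6f s over %d points\r", meanDelta, n
    return meanDelta
End
```

If the sampling really is uniform, the interval could then replace the time column altogether via wave scaling, for example SetScale/P x 0, ReportSampleInterval(DatumZeitA), "s", TmpAk.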

best,
_sk

July 14, 2017 at 06:24 am