Addition of mode "fastload", as well as support for 2-byte and 4-byte precision #13

leonbohmann merged 7 commits into leonbohmann:dev from
Conversation
- Added a 'fastload' option, which reads the channel data into a numpy array.
- Added support for reading measurement data with 2-byte and 4-byte precision:
  - added the method 'read_float()'
  - renamed the method 'read_single()' to 'read_byte()' to avoid confusion.
- Added test files with data at the different precision levels.
The code now sets the attribute "ExportFormat" to zero instead of throwing away the entire extended header.
leonbohmann
left a comment
If I understand correctly, "fastload" will instruct the Channel to load its data using NumPy's np.fromfile?
Do you think it would be a good idea to make fastload the default way of loading data? I mean, if it does the same thing, only faster... If so, you could change the default value for it.
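To illustrate the difference being discussed, here is a minimal sketch of the two loading paths; the function and parameter names are hypothetical and not the package's actual API:

```python
import struct

import numpy as np


def read_channel_data(f, n_samples, fastload=True):
    """Read n_samples little-endian 4-byte floats from an open binary file.

    Hypothetical sketch: names here are illustrative, not the package's API.
    """
    if fastload:
        # one vectorized read straight into an ndarray
        return np.fromfile(f, dtype="<f4", count=n_samples)
    # element-by-element fallback, returning a plain Python list
    return [struct.unpack("<f", f.read(4))[0] for _ in range(n_samples)]
```

Both paths decode the same bytes; the fromfile variant simply does it in a single vectorized call.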
leonbohmann
left a comment
Ok, so all in all this seems fine. The only thing I don't get yet is the scale factor.
I am merging this into a new branch "dev", which I will use to develop the future release. Just to keep things organized!
The fastload option actually breaks the conversion into JSON format, because when using fastload it creates the data objects of the channels as ndarrays.
Hi Leon, you could use a condition like "type(data) is ndarray" or "isinstance(data, ndarray)" to check if the data is stored in a numpy array, and if True, apply a statement like "data_to_write = data.tolist()" or "data_to_write = list(data)" before writing to json or csv.

For numerical data like we have here, however, it's usually better to use a binary format when writing to disk, as this requires a lot less storage space and makes for faster reading and writing of files. One suggestion here would be to implement a save method which uses the "pickle" module to pickle the whole object. This could be useful for users who want to use Python for further processing. One could also implement a method which writes the measurement data to parquet and the metadata to json. This would make the output more portable.

When working with homogeneous numerical data (where all values in the data structure are of the same datatype), numpy arrays are usually orders of magnitude faster than lists. The numpy and scipy libraries have also implemented heaps of functions which are optimized for just this data structure. You could therefore consider converting the measurement data to an ndarray also when not using fastload. I think your filtering function "lfilt()" might also return an ndarray, but I'm not sure.
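The isinstance check described above could be sketched like this; "to_json_safe" is a hypothetical helper name, not part of the package:

```python
import json

import numpy as np


def to_json_safe(data):
    # convert ndarray channel data to a plain list before JSON export
    # (hypothetical helper name, not part of the package)
    if isinstance(data, np.ndarray):
        return data.tolist()
    return data


channel_data = np.array([0.1, 0.2, 0.3])
# json.dumps(channel_data) would raise a TypeError; the converted list works
json_str = json.dumps(to_json_safe(channel_data))
```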
Yes, you are right about that. Maybe I'll just overhaul the data datatype completely and switch to ndarray. Then it'll be upon the user to decide how to save it. This package should focus on reading the data only; most users will probably create their own plots and files either way...
In that case, the pickle module would be a perfect fit. It allows you to dump an item in your workspace to a file with only a few lines of code. The file can then just as easily be loaded into the workspace again in a later Python session for further processing. See a code example below (excuse my Norwegian code):

```python
import os
import pickle


def lag_pickle(mappe_lagre, objekt, filnavn):
    # dump the object to <mappe_lagre>/<filnavn>.pickle
    with open(os.path.join(mappe_lagre, filnavn + '.pickle'), 'wb') as fil:
        pickle.dump(objekt, fil)


def hent_pickle(mappe_last, filnavn):
    # load the pickled object back from <mappe_last>/<filnavn>.pickle
    with open(os.path.join(mappe_last, filnavn + '.pickle'), 'rb') as fil:
        return pickle.load(fil)
```
By the way, I've found a bug with the fastload mode which occurs when the file has fewer datapoints than is indicated in the header. The regular mode raises an IndexError there, but fastload doesn't, and produces gibberish instead. I'll try and fix it. |
That will be a problem for the reading using the original method as well. Therefore, we should consider some error handling to prevent the code from failing with both methods.
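One way the fastload path could be guarded might look like this; the helper name is hypothetical, and this is just one possible shape for the error handling:

```python
import numpy as np


def fastload_checked(f, n_samples, dtype="<f4"):
    # np.fromfile silently returns a shorter array when the file runs out
    # of bytes, so compare the actual count against the header's sample
    # count (hypothetical helper, sketching one possible fix)
    data = np.fromfile(f, dtype=dtype, count=n_samples)
    if data.size < n_samples:
        raise IndexError(
            f"file contains {data.size} samples, header indicates {n_samples}"
        )
    return data
```

Raising IndexError here would match the behaviour the regular mode already shows on truncated files.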
Yes, true. But I think this package should then only be used to convert the binary data to an ndarray in Python. The second step will be up to the user. While using the package myself, I realised that I tend to make a lot of changes in the package just so it fits my needs. It'll be more efficient if we keep things and responsibilities simple, I think!
New version is released containing your changes. I think the extended header data is really helpful as well, so I included that in the README!
Hello Leon, is it possible to convert the read file into .xlsx?
For questions and feature requests please create a new issue. Surely it is possible, but unfortunately that functionality is not part of this package. I did a quick search and found out that you can convert a pandas DataFrame to Excel. For that, you would have to convert the channels to a DataFrame first. The other option would be to save the data as a CSV file; you can open that with Excel directly and save it as xlsx from there!
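Both routes can be sketched in a few lines; the channel names below are made up for illustration, and note that DataFrame.to_excel needs an xlsx writer such as the openpyxl package installed:

```python
import pandas as pd

# hypothetical channel data; column names are made up for illustration
channels = {"time": [0.0, 0.1, 0.2], "voltage": [1.2, 1.3, 1.1]}
df = pd.DataFrame(channels)

try:
    # route 1: write xlsx directly (requires an engine such as openpyxl)
    df.to_excel("measurement.xlsx", index=False)
except ImportError:
    pass  # no xlsx writer installed; the csv route below still works

# route 2: write a csv, which Excel can open directly
df.to_csv("measurement.csv", index=False)
```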
Kind of a big pull request, introducing two features at once. They've been implemented in such a way as not to interfere with existing functionality.