
Addition of mode "fastload", as well as support for 2-byte and 4-byte precision#13

Merged
leonbohmann merged 7 commits into leonbohmann:dev from hakonbar:master
on Apr 27, 2022

Conversation

@hakonbar
Contributor

Kind of a big pull request, introducing two features at once. They've been implemented in such a way as to not interfere with existing functionality.

  • Added a 'fastload' mode, which takes advantage of the fact that consecutive data points in a measurement channel are stored as a contiguous "byte chunk" in the catman binary format instead of blockwise. You therefore only need to pass a pointer to the first byte as well as the length of the chunk.
  • Added the method "Channel.readExtHeader" in order to get at the attribute "ExportFormat". This attribute indicates the byte depth (precision) of the measurement file, allowing the reading algorithm to distinguish between the formats.
  • Added the method "BinaryReader.read_float", which reads in 4-byte floating point numbers.
  • Changed the name of the method "read_single" to "read_byte" to avoid confusion with the newly added method.
  • Added some sample data from HBK with 2-, 4- and 8-byte data.
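A minimal sketch of the fastload idea (the function and the depth-to-dtype mapping here are illustrative assumptions, not the package's actual API): because a channel's samples are contiguous on disk, a single `np.fromfile` call with an offset and a count replaces per-sample reads.

```python
import numpy as np

# Hypothetical mapping from a byte depth (as indicated by ExportFormat)
# to a numpy dtype; the real codes in the catman format may differ.
DEPTH_TO_DTYPE = {2: np.int16, 4: np.float32, 8: np.float64}

def fastload_channel(path, offset, count, byte_depth):
    """Read `count` consecutive samples starting at byte `offset`."""
    dtype = DEPTH_TO_DTYPE[byte_depth]
    return np.fromfile(path, dtype=dtype, count=count, offset=offset)
```

Reading the whole chunk in one call avoids the Python-level loop of the blockwise reader.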

# Added a 'fastload' option, which reads the channel data to a numpy array.

# Added support for reading measurement data with 2-byte and 4-byte precision
   - added the method 'read_float()'
   - renamed the method 'read_single()' to 'read_byte()' to avoid confusion.

# Added test files with data at the different precision levels.
The code now sets the attribute "ExportFormat" to zero instead of throwing away the entire extended header.
Owner

@leonbohmann left a comment


If I understand correctly, "fastload" will instruct the Channel to load its data using NumPy's np.fromfile implementation?

Do you think it would be a good idea to make fastload the default way of loading data? I mean, if it does the same thing, only faster... If so, you could change the default value for it.

Owner

@leonbohmann left a comment


OK, so all in all this seems fine. The only thing I don't get yet is the scale factor...

@leonbohmann added the question (Further information is requested) and enhancement (New feature or request) labels, then removed the question label, on Apr 26, 2022
@leonbohmann
Owner

I am merging this into a new branch "dev" which I will use to develop the future release. Just to keep things organized!

@leonbohmann leonbohmann changed the base branch from master to dev April 27, 2022 19:03
@leonbohmann leonbohmann merged commit f37cc49 into leonbohmann:dev Apr 27, 2022
@leonbohmann
Owner

The fastload option actually breaks the conversion to JSON format: when using fastload, the channels' data objects are created as ndarray, which is not JSON-serializable.

@hakonbar
Contributor Author

Hi Leon, you could use a check like "isinstance(data, ndarray)" (or "type(data) is ndarray") to see whether the data is stored in a numpy array, and if so, apply a statement like "data_to_write = data.tolist()" or "data_to_write = list(data)" before writing to json or csv.
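As a concrete sketch of that check (the helper name is illustrative):

```python
import json
import numpy as np

def jsonable(data):
    # numpy arrays are not JSON-serializable; plain lists are
    if isinstance(data, np.ndarray):
        return data.tolist()
    return data

channel_data = np.array([1.0, 2.5, 3.0])
json.dumps(jsonable(channel_data))  # works; json.dumps(channel_data) would raise TypeError
```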

For numerical data like we have here, however, it's usually better to use a binary format when writing to disk, as this requires a lot less storage space and makes for faster reading and writing of files. One suggestion here would be to implement a save method which uses the "pickle" module to pickle the whole object. This could be useful for users who want to use Python for further processing. One could also implement a method which writes the measurement data to parquet and the metadata to json. This would make the output more portable.

When working with homogeneous numerical data (where all values in the data structure are of the same datatype), numpy arrays are usually orders of magnitude faster than lists. The numpy and scipy libraries have also implemented heaps of functions which are optimized for just this data structure. You could therefore consider converting the measurement data to an ndarray also when not using fastload. I think your filtering function "lfilt()" might also return an ndarray, but I'm not sure.
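A small illustration of the difference (the exact speed-up depends on the machine and the data size):

```python
import numpy as np

samples = [0.1 * i for i in range(1_000_000)]
arr = np.asarray(samples)

# List version: an interpreted loop, touching one Python object per value
scaled_list = [2.0 * x for x in samples]

# ndarray version: one vectorized C-level operation over the whole buffer
scaled_arr = 2.0 * arr
```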

@leonbohmann
Owner

Yes, you are right about that. Maybe I'll just overhaul the datatype of the data completely and switch to ndarray. Then it'll be up to the user to decide how to save it.

This package should focus on reading the data only; most users will probably create their own plots and files either way...

@hakonbar
Contributor Author

hakonbar commented May 20, 2022

In that case, the pickle module would be a perfect fit. It allows you to dump an item in your workspace to file with only a few lines of code. The file can then just as easily be loaded into the workspace again in a later Python session for further processing. See a code example below (excuse my Norwegian code):

```python
import os
import pickle
from pathlib import Path

def lag_pickle(mappe_lagre, objekt, filnavn):
    # Force a .pkl extension on the output file name
    if not filnavn.endswith('.pkl'):
        filnavn = Path(filnavn).stem + '.pkl'

    with open(os.path.join(mappe_lagre, filnavn), 'wb') as outp:
        pickle.dump(objekt, outp, pickle.HIGHEST_PROTOCOL)

def hent_pickle(mappe_last, filnavn):
    with open(os.path.join(mappe_last, filnavn), 'rb') as inp:
        objekt = pickle.load(inp)

    return objekt
```

@hakonbar
Contributor Author

By the way, I've found a bug with the fastload mode which occurs when the file has fewer datapoints than is indicated in the header. The regular mode raises an IndexError there, but fastload doesn't, and produces gibberish instead. I'll try and fix it.

@leonbohmann
Owner

> By the way, I've found a bug with the fastload mode which occurs when the file has fewer datapoints than is indicated in the header. The regular mode raises an IndexError there, but fastload doesn't, and produces gibberish instead. I'll try and fix it.

That will be a problem for reading via the original method as well. Therefore we should consider some error handling to prevent the code from failing with either method...
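One way to guard both code paths (a sketch; the function name and the choice of IndexError are assumptions, not the package's actual error handling) is to compare the bytes actually present against what the header promises before reading:

```python
import os
import numpy as np

def checked_fromfile(path, dtype, count, offset=0):
    """Refuse to read when the file is shorter than the header claims,
    instead of silently returning gibberish."""
    available = os.path.getsize(path) - offset
    needed = count * np.dtype(dtype).itemsize
    if available < needed:
        raise IndexError(
            f"file provides {available} bytes after the header, "
            f"but {needed} bytes ({count} samples) were promised")
    return np.fromfile(path, dtype=dtype, count=count, offset=offset)
```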

@leonbohmann
Owner

leonbohmann commented May 20, 2022

> In that case, the pickle module would be a perfect fit....

Yes, true. But I think this package should then only be used to convert the binary data to some ndarray in Python. The second step will be up to the user.

While using the package myself, I realised that I tend to make a lot of changes in the package just so it fits my needs. It'll be more efficient if we keep things and responsibilities simple, I think!

@leonbohmann
Owner

New version is released containing your changes. I think the extended header data is really helpful as well, so I included that in the readme!

@LarissaPestana

Hello Leon, is it possible to convert the read file into .xlsx?

@leonbohmann
Owner

For questions and feature requests, please create a new issue.

Surely it is possible, but unfortunately that functionality is not part of this package. I did a quick search and found out that you can convert a pandas dataframe to Excel. For that, you would have to convert the channels to a dataframe first.

The other option would be to save the data as a csv file; you can open that with Excel directly and save it as xlsx from there!
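For illustration, with placeholder channel data (writing .xlsx via `DataFrame.to_excel` additionally requires an engine such as openpyxl to be installed):

```python
import pandas as pd

# Placeholder data; in practice these values come from the reader's channels
channels = {"Time": [0.0, 0.1, 0.2], "Force": [1.2, 3.4, 5.6]}
df = pd.DataFrame(channels)

df.to_csv("measurement.csv", index=False)
# With openpyxl installed, the same frame can go straight to Excel:
# df.to_excel("measurement.xlsx", index=False)
```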

