iotools

News/Changelog

0.4-0   2025-02-19
    *   chunk.tapply() has been fully re-written and now takes advantage
        of the key-aware chunk reader. The sep= argument is ignored if the
        input is a chunk reader and defaults to "\t" otherwise.

    *   The implementation now works both with and without the growable
        vector API and across a wide range of R versions.

    *   remove superfluous rawToChar() on column names (#38)

0.3-5   2023-11-29
    *   add format casts

0.3-4   2023-11-28
    *   pass max.size argument through chunk.map() (#39)

    *   minor change to work around rchk not being able to follow
        protections across functions.

0.3-3   2022-12-09
    *   fix error/segfault (depending on R version) in as.output()
        when a type that doesn't support LENGTH() is passed (such as
        NULL).

    *   CH.MAX.SIZE was ignored in chunk.apply() for parallel jobs

    *   add CH.BINARY flag which can be set to TRUE if the merge step
        should be performed continually as a call to a binary CH.MERGE
        function instead of collecting all results and then calling
        CH.MERGE.

        Analogously, CH.INITIAL has been added which is a function
        called on the first result. If NULL then
        CH.MERGE(NULL, result) is called instead.

        Note: in previous versions regular chunk.apply() was behaving
        like CH.BINARY=FALSE, but when parallel was set then it
        behaved like CH.BINARY=TRUE. Now CH.BINARY is explicit.

    *   new parallel chunk.apply() implementation

        The related arguments have been re-named to avoid clashes with
        actual function arguments. CH.MERGE now behaves the same way
        as with sequential processing for consistency.

        CH.PARALLEL - if set to 2 or higher triggers parallel
                      processing of chunks
        CH.SEQUENTIAL - if FALSE then parallel processing is allowed
                      to change the order of the chunks in order to
                      process chunks that yield results faster first.
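
The difference between the two merge modes described in the CH.BINARY
entry above can be sketched in base R. This is only an illustration of
the described semantics, not the package's implementation: chunk.apply()
is not called, and the names results, merge_fn and binary_merge are
hypothetical stand-ins.

```r
# Illustration in base R only; no iotools functions are used here.
results  <- list(c(a = 1), c(b = 2), c(c = 3))  # stand-ins for per-chunk results
merge_fn <- function(...) c(...)                # plays the role of CH.MERGE

# CH.BINARY = FALSE: collect all results, then make a single n-ary call.
collected <- do.call(merge_fn, results)

# CH.BINARY = TRUE: merge continually, two arguments at a time.
# 'initial' plays the role of CH.INITIAL; if it is NULL, the first
# result is merged as merge(NULL, result), as the entry describes.
binary_merge <- function(results, merge, initial = NULL) {
  acc <- NULL
  first <- TRUE
  for (r in results) {
    if (first) {
      acc <- if (is.null(initial)) merge(NULL, r) else initial(r)
      first <- FALSE
    } else {
      acc <- merge(acc, r)
    }
  }
  acc
}

folded <- binary_merge(results, merge_fn)
```

With a merge function such as c() both modes give the same result. A
merge function that does not handle NULL gracefully (for example `+`,
where NULL + 1 is numeric(0) in R) is the kind of case where supplying
an initial function, e.g. binary_merge(list(1, 2, 3), `+`, initial = identity),
makes a difference.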

0.3-2   2021-07-23
    *   minor changes for compatibility with write-barrier
        and R-devel (no functional difference)

0.3-1   2020-03-10
    *   make sure connections are closed in examples so
        check doesn't complain

    *   add PROTECT() to chunk.apply() and string singletons

0.3-0   2020-03-09
    *   integers incorrectly parsed empty strings to 0
        instead of NA (#27)

    *   add as.output.raw() which supports both direct file
        descriptors and connections

    *   Extend the handling of as.output()
        as.output() now supports three modes:
          1) con=NULL: a raw vector is created
          2) con=connection: writes output to binary connection
          3) con=iotools.stderr/stdout/fd(fd): writes directly
             to a file descriptor

        Also as.output() is now pass-through for raw vectors.

        Finally, most methods now support keys to be either a
        logical value to suppress names/row names or it can
        also be a character vector in which case its content
        is used as keys.

0.2-6   2018-02-05
    *   add support for logical vectors in fdrbind

0.2-5   2018-01-24
    *   disable non-blocking raw fd reads on Windows since select()
        does NOT work on FDs there.

0.2-4   2017-04-13
    *   remove unnecessary reference to stdout

    *   increase temporary buffer to (hopefully) appease gcc7

    *   add stdout_writeBin C code

    *   add fdrbind()

0.2-3   2016-09-16
    *   fix a bug in timeout parameter of read.chunk() where subsecond
        timeouts were computed incorrectly

0.2-2   2016-04-26
    *   add support for raw file descriptors and timeout in the chunk
        reader

0.2-1   2015-08-20
    *   use R_GetConnection() API in R >=3.3.0

    *   add chunk.map to mimic hmr locally

    *   fix col.names handling in write.csv.raw() (#26)

    *   clean up as_output_matrix to be 64-bit safe

    *   use internal C methods for all output
        support ragged lists (with recycling) and long vectors in
        as_output_dataframe

    *   support I() to tag objects that don't want to use
        as.character()

    *   make string coercion rules consistent

    *   re-factor as.output.data.frame to use dybuf

    *   support binary connection con in as.output() instead of
        buffering

    *   add support for quoting via quote= parameter (#25)

0.1-12  2015-07-28
    *   don't import parallel::mc* since it doesn't exist on Windows

0.1-11  2015-07-28
    *   fix issues, mostly convert to 64-bit

0.1-10  2015-06-22
    *   remove old stdio API

    *   add quoting to read.csv.raw

    *   support quotes in character fields (#24)

0.1-9   2015-06-22
    *   fix handling of Windows line endings (#23)

0.1-8   2015-03-18
    *   add support for iterators - imstrsplit/idstrsplit
        (Thanks to Mike Kane! - #19)

    *   add tests and fixes to make them run on edge cases

    *   fix mstrsplit when given length zero input

    *   re-factor as.output() to use dynamic buffers

0.1-7   2015-02-10
    *   add C implementations of as.output()

0.1-6   2015-02-08
    *   support tab/comma separated files with as.output() when x is a
        data.frame or matrix

    *   make loading hmr silently the default until we rename hmr and
        go to CRAN

    *   fix header=TRUE bug

    *   treat NAs in dstrsplit list input as a way to skip columns

0.1-5   2014-12-15
    *   Removed "pipeline" parameter for chunk.apply and updated the
        documentation

    *   Parallel option added to chunk.apply()

    *   major re-structuring of the raw parsers (dstrsplit and
        mstrsplit)

------------------------------------------------------------------------
  Previous versions included code for Hadoop Map/Reduce; that code
  has now been moved to a separate package:
  https://github.com/s-u/hmr
------------------------------------------------------------------------

0.1-4   2014-06-09
    *   support names from colspec, support list colspec

    *   add experimental remote submission capability

0.1-3   2014-05-20
    *   add hadoop.opt option and hadoop.conf support

0.1-2
    *   fix missing PROTECT in chunk.tapply

0.1-1   2014-04-17
    *   add key-awareness when splitting

    *   add ctapply() - more efficient implementation of tapply()
        for contiguous keys

    *   add support for Hadoop 2.x
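
The idea behind a tapply() for contiguous keys, as mentioned in the
ctapply() entry above, can be sketched in base R. This is a conceptual
illustration only and not how ctapply() is implemented; the function
contiguous_tapply and the sample data are hypothetical. The sketch
treats every run of equal adjacent keys as one group, so a single pass
over the index is enough and no sorting or hashing of keys is needed.

```r
# Conceptual sketch in base R; ctapply() itself is not used here.
contiguous_tapply <- function(x, index, fun) {
  runs   <- rle(as.character(index))     # runs of equal adjacent keys
  ends   <- cumsum(runs$lengths)         # last position of each run
  starts <- ends - runs$lengths + 1L     # first position of each run
  out <- lapply(seq_along(starts),
                function(i) fun(x[starts[i]:ends[i]]))
  names(out) <- runs$values
  out
}

keys <- c("a", "a", "b", "b", "b", "c")  # keys already grouped together
vals <- c(1, 2, 3, 4, 5, 6)
res  <- contiguous_tapply(vals, keys, sum)
```

In this sketch a key that reappears after a different key starts a new
group, which is why the input is expected to arrive with equal keys
adjacent to each other.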

0.1-0   2013-05-23
    *   initial public release