Ferret

Declarative web scraping

v0.8.1

4 years ago

Fixed

  • Added existence check to CLICK and CLICK_ALL functions. #341
  • Added a check whether an element is in the viewport before scrolling. #342

v0.8.0

4 years ago

Added

  • Delay randomization for inputs. #283
  • Namespace support. #269
  • iframe support. #315
  • Better emulation of user interaction. #316, #331
  • ESCAPE_HTML, UNESCAPE_HTML and DECODE_URI_COMPONENT functions. #318
  • XPath support. #322
  • Regular expression operator. #326
  • INNER_HTML_SET and INNER_TEXT_SET functions. #329
  • Possibility to set viewport size. #334
  • FOCUS function. #340

Changed

  • RAND accepts optional upper and lower limits. #271
  • Updated CDP definitions. #328
  • Logic of iterator termination. #330

Fixed

  • Order of arguments in SCROLL function. #269
  • The command line parameter "--param" does not support colons. #282
  • Race condition during WAIT_NAVIGATION call. #281
  • Arithmetic operators. #298
  • Missing UA setting for HTTP driver. #318
  • Improper math operator used in calculating page load timeout. #319
  • Wrong function names in README. #321
  • JSON serialization for HTTPHeader type. #323

v0.7.0

5 years ago

Added

  • Autocomplete to CLI #219.
  • New mouse functions - MOUSE(x, y) and SCROLL(x, y) #237.
  • WAIT_NO_ELEMENT, WAIT_NO_CLASS and WAIT_NO_CLASS_ALL functions #249.
  • Computed HTMLElement.style property #255.
  • ATTR_GET, ATTR_SET, ATTR_REMOVE, STYLE_GET, STYLE_SET and STYLE_REMOVE functions #255.
  • WAIT_STYLE, WAIT_NO_STYLE, WAIT_STYLE_ALL and WAIT_NO_STYLE_ALL functions #256.
  • Cookies support. A document can now be loaded with preset cookies, and HTMLDocument has a .cookies property. The COOKIE_DEL, COOKIE_SET and COOKIE_GET functions were added to manipulate cookies #242.
LET doc = DOCUMENT(url, {
    driver: "cdp",
    cookies: [{
        name: "x-e2e",
        value: "test"
    }, {
        name: "x-e2e-2",
        value: "test2"
    }]
})

Changed

  • Renamed ParseTYPEP to MustParseTYPE #231.
  • Added context to all HTML objects #235.
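
The ParseTYPEP to MustParseTYPE rename appears to follow the common Go convention in which a Must-prefixed function panics instead of returning an error. That reading is an assumption, not something the changelog states. A minimal sketch of the convention using only the standard library; parseInt and mustParseInt are hypothetical names and are not part of Ferret's API:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseInt is a hypothetical parser that reports failure through an error.
func parseInt(input string) (int, error) {
	return strconv.Atoi(input)
}

// mustParseInt wraps parseInt and panics on failure. The "Must" prefix
// signals to callers that invalid input is treated as a programming error.
func mustParseInt(input string) int {
	v, err := parseInt(input)
	if err != nil {
		panic(fmt.Sprintf("mustParseInt(%q): %v", input, err))
	}
	return v
}

func main() {
	fmt.Println(mustParseInt("42")) // prints 42
}
```

Must-style helpers are usually reserved for inputs known to be valid at compile time, such as literals in tests or package-level initialization.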

Fixed

  • Click events are not cancellable #222.
  • Name collision #223.
  • Invalid return in FQL Compiler constructor #227.
  • Incorrect string length computation #238.
  • Access to HTML object properties via dot notation #239.
  • Graceful process termination #240.
  • Browser launcher for macOS #246.

Breaking changes

  • New runtime type system #232.
  • Moved and renamed collections.IterableCollection and collections.CollectionIterator interfaces. Now they are in core package and called Iterable and Iterator 1af8b37.
  • Renamed collections.Collection interface to collections.Measurable 1af8b37.
  • Moved html interfaces from runtime/values package into drivers package #234.
  • Changed drivers initialization. Replaced old drivers.WithDynamic and drivers.WithStatic methods with a new drivers.WithContext method with optional parameter drivers.AsDefault() #234.
  • New document load params #234.
LET doc = DOCUMENT(url, {
    driver: "cdp"
})

v0.6.0

5 years ago

Added

  • Added support for context.Done() to interrupt an execution #201.
  • Added support for custom HTML drivers #209.
  • Added support for dot notation access and assignments for custom types #214.
  • Added ELEMENT_EXISTS(doc, selector) -> Boolean function #210.
LET exists = ELEMENT_EXISTS(doc, ".nav")
  • Added PageLoadParams to DOCUMENT function #214.
LET doc = DOCUMENT("https://www.google.com/", {
    dynamic: true,
    timeout: 10000
})

Fixed

  • Math operators precedence #202.
  • Memory leak in DOWNLOAD function #213.

Breaking change

  • (Embedded) Removed builtin drivers initialization in Program #198. The initialization must be done via context manually.

v0.5.2

5 years ago

Fixed

  • Does not close a browser tab when it fails to load a page #193.
  • HTMLElement.value does not return an actual element value #195.
  • Compiles a query with a duplicate variable in FOR statement #196.
  • Default CDP address #197.

v0.5.1

5 years ago

Fixed

  • Unable to change a page load timeout #186.
  • RETURN doc returns an empty string #187.
  • Unable to pass an HTML Node without a selector to INNER_TEXT and INNER_HTML #187.
  • doc.innerText returns an error #187.
  • Panics when WAIT_CLASS does not receive all required arguments #192.

v0.5.0

5 years ago

Added

Fixed

  • Unable to define variables and make function calls before FILTER, SORT, etc. statements #148.
  • Unable to use params in LIMIT clause #173.
  • RIGHT should return a substring counting from the right rather than the left #164.
  • INNER_HTML returns outer HTML instead of inner HTML for dynamic elements #170.
  • INNER_TEXT returns HTML instead of text for dynamic elements #170.
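
The intended RIGHT behavior (a substring counted from the right) can be illustrated with a small Go sketch. The helper right is hypothetical and is not Ferret's implementation. Its handling of edge cases (non-positive counts, counts longer than the string) is an assumption:

```go
package main

import "fmt"

// right is a hypothetical helper that returns the last n characters of s.
// It works on runes so that multi-byte characters are not split.
// Assumed edge cases: n <= 0 yields "", n >= length yields s unchanged.
func right(s string, n int) string {
	runes := []rune(s)
	if n <= 0 {
		return ""
	}
	if n >= len(runes) {
		return s
	}
	return string(runes[len(runes)-n:])
}

func main() {
	fmt.Println(right("foobar", 3)) // prints "bar"
}
```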

Breaking change

  • Name collision between math and utils packages in standard library. Renamed LOG to PRINT #162.

v0.4.0

5 years ago

Added

  • COLLECT keyword #141
  • VALUES function #128
  • MERGE_RECURSIVE function #140

Fixed

  • Unable to use string literals as object properties commit

v0.3.0

5 years ago

Added

Fixed

  • KEEP function does not perform deep cloning commit
  • WaitForNavigation callback can get called more than once commit
  • Concurrent map iteration and map write commit

Breaking changes

  • Renamed .innerHtml to .innerHTML commit

v0.2.0

5 years ago

Changelog

Added