G06F16/986

A Transferable Neural Architecture for Structured Data Extraction From Web Documents
20230014465 · 2023-01-19

Systems and methods for efficiently identifying and extracting machine-actionable structured data from web documents are provided. The technology employs neural network architectures which process the raw HTML content of a set of seed websites to create transferable models regarding information of interest. These models can then be applied to the raw HTML of other websites to identify similar information of interest. Data can thus be extracted across multiple websites in a functional, structured form that allows it to be used further by a processing system.

DYNAMIC WEB PAGE CLASSIFICATION IN WEB DATA COLLECTION

The current application discloses processor-implemented methods and systems of processing unclassified HTML responses collected in the context of a data collection service, the method comprising, in one embodiment, receiving unclassified HTML documents, isolating elements relevant for category identification, deriving classification attributes from the isolated elements, and applying a Machine Learning-based classification model resulting in HTML data items classified and labelled accordingly. In certain embodiments the Machine Learning model may be a model trained on a pre-created training data set labeled manually or in an automatic fashion.
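The pipeline in this abstract (isolate relevant elements, derive classification attributes, apply a model) can be sketched roughly as follows. This is only an illustration under stated assumptions: the tracked tag set, the function names, and `toy_model` are hypothetical, and a simple hand-written rule stands in for the trained Machine Learning model the abstract refers to.

```python
from collections import Counter
from html.parser import HTMLParser
from typing import Callable, Dict


class TagCounter(HTMLParser):
    """Isolates a small, assumed set of elements relevant for categorisation."""

    TRACKED = {"form", "input", "table", "a", "img", "title"}

    def __init__(self) -> None:
        super().__init__()
        self.counts: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self.counts[tag] += 1


def derive_attributes(html: str) -> Dict[str, int]:
    """Derive classification attributes: one count per tracked tag."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return {tag: parser.counts[tag] for tag in sorted(TagCounter.TRACKED)}


def classify(html: str, model: Callable[[Dict[str, int]], str]) -> str:
    """Apply any model callable that maps attributes to a label."""
    return model(derive_attributes(html))


def toy_model(features: Dict[str, int]) -> str:
    # Hypothetical stand-in for a trained classifier.
    if features["form"] >= 1 and features["input"] >= 2:
        return "login_or_form_page"
    return "content_page"
```

Because `classify` accepts any callable, a model trained on a labelled data set could be passed in place of `toy_model` without changing the feature-extraction step.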

FRAMEWORK FOR EXPOSING CONTEXT-DRIVEN SERVICES WITHIN A WEB BROWSER

Systems and methods for securely exposing context-driven services within a web browser are described. An example method includes receiving manifests from hub apps (e.g., remote services). The manifests define requested context types for the hub apps. When the web browser loads a web page, the web browser may execute context extractors to extract context from the web page. The context extractors that are executed are based on the context types requested by the hub apps. The extracted context is then sent to the corresponding hub apps without providing the hub apps direct access to the web page. For instance, the hub apps do not have access to the document object model (DOM) of the web page and the hub apps cannot inject data into the web page.
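A minimal sketch of the manifest-driven dispatch might look like the following. The extractor registry, the context type names (`email`, `phone`), and the manifest shape (app name mapped to a list of requested types) are all assumptions made for illustration; the abstract does not specify them.

```python
import re
from typing import Callable, Dict, List

# Hypothetical registry: context type -> extractor run by the browser itself.
EXTRACTORS: Dict[str, Callable[[str], List[str]]] = {
    "email": lambda page: re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", page),
    "phone": lambda page: re.findall(r"\+?\d[\d -]{7,}\d", page),
}


def dispatch_context(
    page: str, manifests: Dict[str, List[str]]
) -> Dict[str, Dict[str, List[str]]]:
    """Run only the extractors some hub app requested, then route the results.

    Each hub app receives extracted values for its requested context types
    only; the page content itself is never handed over.
    """
    requested = {
        ctx for types in manifests.values() for ctx in types if ctx in EXTRACTORS
    }
    extracted = {ctx: EXTRACTORS[ctx](page) for ctx in requested}
    return {
        app: {ctx: extracted[ctx] for ctx in types if ctx in extracted}
        for app, types in manifests.items()
    }
```

The point of the design is the isolation boundary: hub apps see only the dictionary returned for them, not the page or its DOM.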

Method and system for detecting slow page load
11550870 · 2023-01-10

A method and system for detecting slow page load is provided. An example system comprises a page request detector, a time-out module, a time-out monitor, and a lightweight page requestor. The page request detector may be configured to detect a request for a web page. The time-out module may be configured to commence a time-out period in response to a request for a web page. The time-out module cooperates with the time-out monitor that may be configured to determine that rendering of a rich version of the requested web page has not commenced at an expiration of the time-out period. The lightweight page requestor may be configured to cause a lightweight version of the requested page to be provided to the client system when the time-out monitor determines that the rendering of a rich version of the requested web page has not commenced at an expiration of the time-out period.
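The time-out logic can be illustrated with a short sketch. It assumes a hypothetical `rendering_started` callback that reports whether rendering of the rich page has commenced; the polling loop and the injectable `clock` and `sleep` parameters are illustrative choices, not details from the abstract.

```python
import time
from typing import Callable


def choose_page_version(
    rendering_started: Callable[[], bool],
    timeout_s: float,
    poll_s: float = 0.01,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "rich" if rendering commenced within the time-out period,
    otherwise "lightweight" so a lighter page can be requested instead."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        if rendering_started():
            return "rich"
        sleep(poll_s)
    # Final check at expiration of the time-out period.
    return "rich" if rendering_started() else "lightweight"
```

For example, `choose_page_version(lambda: False, 0.05)` falls through to the lightweight version once the period expires.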

AUTOMATIZED PARSING TEMPLATE CUSTOMIZER
20230214588 · 2023-07-06

Systems and methods to intelligently adapt parsing rules according to the layout changes occurring in multiple targets are disclosed. Specifically, the disclosure provides a solution to detect the layout changes in a target domain and to update parsing templates or parsing rules. The disclosed embodiments in one aspect describe methods and systems to receive and store parsing templates or parsing rules and monitoring tables or a list of related URLs within an internal storage facility. They also describe methods and systems to scrape and parse data by following parsing rules or using parsing templates. The methods and systems describe the manner in which the parsed data and the actual data are analyzed to detect any changes in the layout of the target domain(s). The methods and systems give details on how to decide whether to update parsing rules or parsing templates depending on the layout changes in the target domains.
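One simple way to picture layout-change detection is sketched below. It assumes a parsing template is a mapping from field name to a regular expression with one capture group, and it treats a high share of missing fields as the signal that the layout changed. Both the template format and the `max_missing_ratio` heuristic are assumptions for illustration; the abstract does not say how parsed and actual data are compared.

```python
import re
from typing import Dict, List, Optional


def parse_with_template(html: str, template: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Apply each field's pattern; a field with no match is recorded as None."""
    parsed: Dict[str, Optional[str]] = {}
    for field, pattern in template.items():
        match = re.search(pattern, html, re.S)
        parsed[field] = match.group(1).strip() if match else None
    return parsed


def template_needs_update(
    pages: List[str], template: Dict[str, str], max_missing_ratio: float = 0.5
) -> bool:
    """Flag the template when too many fields fail to parse across the pages."""
    total = len(pages) * len(template)
    if total == 0:
        return False
    missing = sum(
        value is None
        for page in pages
        for value in parse_with_template(page, template).values()
    )
    return missing / total > max_missing_ratio
```

A monitoring job could run `template_needs_update` over the stored list of related URLs and trigger a template refresh only when it returns `True`.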

Web-based medical image viewer with web database
11550869 · 2023-01-10

Methods and systems for rendering medical images within a web browser application. The web browser application retrieves a worklist and automatically determines an image study from the worklist to be cached. The web browser application retrieves at least one medical image included in the image study. The web browser application creates a web database for storing the at least one medical image within the browser application. When a user selects a medical image for display within the web browser, the web browser application determines whether the medical image is stored in the web database. When the medical image is stored in the web database, the web browser application retrieves the medical image from the web database. When the medical image is not stored in the web database, the web browser application retrieves the medical image from a remote image repository.
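The cache-then-fallback lookup can be sketched as below. A plain dictionary stands in for the in-browser web database, and `fetch_remote` is a hypothetical callable representing the remote image repository; neither name comes from the abstract.

```python
from typing import Callable, Dict, Iterable, Tuple


class BrowserImageCache:
    """Stand-in for the web database: images keyed by a hypothetical image id."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._db: Dict[str, bytes] = {}
        self._fetch_remote = fetch_remote

    def prefetch_study(self, image_ids: Iterable[str]) -> None:
        """Cache the images of a study selected from the worklist."""
        for image_id in image_ids:
            self._db[image_id] = self._fetch_remote(image_id)

    def get(self, image_id: str) -> Tuple[bytes, str]:
        """Return the image and where it came from."""
        if image_id in self._db:
            return self._db[image_id], "web_database"
        return self._fetch_remote(image_id), "remote_repository"
```

In this sketch a cache miss does not store the fetched image; whether the described system does so is not stated in the abstract.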

Method and apparatus for HTML construction using the widget paradigm
11553029 · 2023-01-10

A method and apparatus for building and delivering an HTML (Hypertext Markup Language) construction representing digital content layout are disclosed herein. In one embodiment, the method includes constructing the digital content layout by selecting a delivery method, selecting subwidgets, and using the selected subwidgets to build and represent the digital content layout, wherein the delivery method is an inline, and wherein the subwidgets include images, texts, or videos. The method also includes integrating the digital content layout with a designated website. The method further includes delivering third-party content to the designated website. In addition, the method includes tracking interactive content on the designated website.

Privacy trustworthiness based API access
11550937 · 2023-01-10

A method may include providing access to a first application programming interface (API) provided by a first party and a second API provided by a second party. The method may also include collecting a first set of API data sources related to the first API and a second set of API data sources related to the second API. The method may additionally include using a deep learning model to predict a privacy trustworthiness level for the first API and the second API, and disabling access to the first API based on the privacy trustworthiness level of the first API being below a threshold level.

Machine first approach for identifying accessibility, non-compliances, remediation techniques and fixing at run-time

Accessibility in software engineering is treated as expensive and time consuming; hence adoption of accessibility is a challenge despite stringent timelines and regulatory requirements published around the world. Moreover, implementing accessibility increases project cost due to manual intervention and dependency on niche skills, which are scarce in industry. Embodiments of the present disclosure provide a system and method for automated identification of applicable accessibility guidelines and determination of remediation techniques for fixing issues in webpages, wherein webpages are rendered, and applicable accessibility guidelines are identified based on user interface (UI) elements comprised in the webpages. Further, content associated with rendered webpages is analyzed using the applicable accessibility guidelines to identify webpage non-compliance issue(s) and remediation technique(s) thereof. Fixes for the non-compliance issues are determined based on the webpages and applied on a document object model (DOM) based on a current state associated with the webpage and/or UI elements.

Interactive web application editor

Interactive editing of a web application at a user end station is described. The user end station dynamically loads into a running instance of the web application an interactive editor that allows editing of one or more user interface (UI) components of that running instance of the web application. A selection of a DOM element of the web application is received and a UI component that corresponds to the selected DOM element is determined. A set of parameters associated with the determined UI component is also determined. A value editor is displayed that is configured to display for at least one of the set of parameters a value and allows for that value to be modified. A modification of at least the value of the at least one of the set of parameters is received and the running instance of the web application is updated to reflect the modified value.