Latest publications and patents on Large Language Models (LLM)

Large Language Models (LLM)

Tip: in addition to this selection on Large Language Models, you can search and filter our:
    * free publication search tool * by author, topic, keyword, date, or journal.
    * free patent search tool * for English-language patents from the European Patent Office.

This is our latest selection of worldwide publications and patents in English on Large Language Models (LLM), drawn from many online scientific journals, classified and focused on: large language model, LLM, generative pre-trained transformer, pre-training, transformer architecture, gradient descent, GPT, tokenization, generative model, self-attention mechanism, masked language model, and MLM.

Graph-to-Text Generation with Bidirectional Dual Cross-Attention and Concatenation

Published on 2025-03-11 by Elias Lemuye Jimale, Wenyu Chen, Mugahed A. Al-antari, Yeong Hyeon Gu, Victor Kwaku Agbesi, Wasif Feroze, Feidu Akmel, Juhar Mohammed Assefa, Ali Shahzad @MDPI

Abstract: Graph-to-text generation (G2T) involves converting structured graph data into natural language text, a task made challenging by the need for encoders to capture the entities and their relationships within the graph effectively. While transformer-based encoders have advanced natural language processing, their reliance on linearized data often obscures the complex interrelationships in graph structures, leading to structural loss. Conversely, graph attention networks excel at capturing graph struc[...]


Our summary: Proposal of a novel mechanism for integrating transformer-based and graph attention encoders to improve graph-to-text generation tasks, achieving higher BLEU and METEOR scores on benchmark datasets, showcasing potential for future research.

Graph-to-text generation, Bidirectional Dual Cross-Attention, Concatenation, Transformer-based encoders

Publication
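As an illustrative sketch only (not the paper's exact architecture), bidirectional cross-attention between a hypothetical transformer encoder output and a graph attention encoder output, followed by feature-wise concatenation, can be written in a few lines of NumPy:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_states, kv_states, d_k):
    # Queries come from one encoder; keys/values from the other.
    scores = q_states @ kv_states.T / np.sqrt(d_k)
    return softmax(scores) @ kv_states

# Hypothetical outputs of a transformer encoder (linearized tokens)
# and a graph attention encoder (nodes), both projected to d = 8 dims.
rng = np.random.default_rng(0)
d = 8
seq = rng.normal(size=(5, d))    # 5 sequence tokens
nodes = rng.normal(size=(3, d))  # 3 graph nodes

# Bidirectional dual cross-attention: each side attends to the other.
seq_to_graph = cross_attention(seq, nodes, d)   # (5, d)
graph_to_seq = cross_attention(nodes, seq, d)   # (3, d)

# Concatenate each stream with its cross-attended counterpart
# along the feature axis before feeding a decoder.
fused_seq = np.concatenate([seq, seq_to_graph], axis=-1)      # (5, 2d)
fused_graph = np.concatenate([nodes, graph_to_seq], axis=-1)  # (3, 2d)
print(fused_seq.shape, fused_graph.shape)  # (5, 16) (3, 16)
```

The fused representations double the feature dimension, so a real decoder would typically follow with a learned projection back to the model width.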

Named Entity Recognition in Online Medical Consultation Using Deep Learning

Published on 2025-03-11 by Ze Hu, Wenjun Li, Hongyu Yang @MDPI

Abstract: Named entity recognition in online medical consultation aims to address the challenge of identifying various types of medical entities within complex and unstructured social text in the context of online medical consultations. This can provide important data support for constructing more powerful online medical consultation knowledge graphs and improving virtual intelligent health assistants. A dataset of 26 medical entity types for named entity recognition for online medical consultations is fi[...]


Our summary: Named entity recognition in online medical consultation addresses the challenge of identifying medical entities in unstructured text, constructing knowledge graphs, and improving virtual health assistants. The proposed deep learning approach outperforms the state-of-the-art method in identifying 26 medical entity types with an average F1 score of 85.47%, supporting real-time intelligent medical decision-making.

Named Entity Recognition, Online Medical Consultation, Deep Learning, Knowledge Graphs

Publication
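The reported average F1 score is an entity-level metric; a minimal sketch of micro-averaged F1 over hypothetical (start, end, entity_type) annotations, not the paper's evaluation code, looks like:

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over (start, end, entity_type) tuples."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)  # exact span-and-type matches
    prec = tp / len(pred_set) if pred_set else 0.0
    rec = tp / len(gold_set) if gold_set else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

# Hypothetical medical-entity annotations for one consultation text.
gold = [(0, 2, "Drug"), (5, 7, "Symptom"), (9, 11, "Disease")]
pred = [(0, 2, "Drug"), (5, 7, "Disease"), (9, 11, "Disease")]
print(round(micro_f1(gold, pred), 2))  # 0.67
```

A prediction with the right span but the wrong type (positions 5–7 above) counts as both a false positive and a false negative.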

Large Language Model-Guided SARSA Algorithm for Dynamic Task Scheduling in Cloud Computing

Published on 2025-03-11 by Bhargavi Krishnamurthy, Sajjan G. Shiva @MDPI

Abstract: Nowadays, more enterprises are rapidly transitioning to cloud computing as it has become an ideal platform to perform the development and deployment of software systems. Because of its growing popularity, around ninety percent of enterprise applications rely on cloud computing solutions. The inherent dynamic and uncertain nature of cloud computing makes it difficult to accurately measure the exact state of a system at any given point in time. Potential challenges arise with respect to task sched[...]


Our summary: Large Language Model-Guided SARSA Algorithm improves task scheduling in cloud computing by enhancing SARSA learning with LLM heuristics, reducing bias and improving performance. Mathematical modeling and experimental results validate the effectiveness of the proposed approach.

Large Language Model, SARSA Algorithm, Dynamic Task Scheduling, Cloud Computing

Publication
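The SARSA rule that the LLM heuristics guide is the standard on-policy temporal-difference update Q(s,a) ← Q(s,a) + α[r + γ·Q(s′,a′) − Q(s,a)]. A toy sketch with made-up states and rewards (not the paper's scheduler):

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy SARSA temporal-difference update."""
    td_target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])

Q = defaultdict(float)
# One hypothetical transition: scheduling action 'a0' in cloud state
# 's0' yields reward 1.0 and moves to state 's1', where action 'a1'
# is chosen (e.g. by an LLM-suggested heuristic rather than pure
# epsilon-greedy exploration).
sarsa_update(Q, "s0", "a0", 1.0, "s1", "a1")
print(round(Q[("s0", "a0")], 3))  # 0.1
```

Unlike Q-learning, SARSA bootstraps on the action actually taken next, which is where an LLM-guided action choice would alter the learning trajectory.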

Enhancing Bottleneck Analysis in Ship Manufacturing with Knowledge Graphs and Large Language Models

Published on 2025-03-10 by Yanjun Ma, Tao Wu, Bin Zhou, Xiaoyang Liang, Jiwang Du, Jinsong Bao @MDPI

Abstract: Ship manufacturing is a critical backbone industry in China, where the nation leads on a global scale in terms of vessel completions and order volumes. However, the high volume of orders often imposes substantial processing loads, increases the risk of equipment failures, and exacerbates production bottlenecks. Despite the accumulation of significant amounts of data in this field, analyzing bottlenecks remains a persistent challenge, primarily due to the presence of heterogeneous, multi-source d[...]


Our summary: Enhancing bottleneck analysis in ship manufacturing through the use of knowledge graphs and large language models, addressing challenges in data integration and deep analysis, and improving manufacturing efficiency.

Knowledge Graphs, Large Language Models, Bottleneck Analysis, Ship Manufacturing

Publication

Can LLMs Be Good Evaluators in Creative Writing Tasks?

Published on 2025-03-10 by Sungeun Kim, Dongsuk Oh @MDPI

Abstract: The evaluation of creative writing has long been a complex and subjective process, made even more intriguing by the rise of advanced Artificial Intelligence (AI) tools like Large Language Models (LLMs). This study evaluates the potential of LLMs as reliable and consistent evaluators of creative texts, directly comparing their performance with traditional human evaluations. The analysis focuses on key creative criteria, including fluency, flexibility, elaboration, originality, usefulness, and spe[...]


Our summary: Evaluation of LLMs as evaluators in creative writing tasks, comparing their performance with human evaluations and analyzing the limitations and strengths of both LLM and human evaluators.

Large Language Models, Creative Writing, Evaluators, Artificial Intelligence

Publication

Multi-Channel Speech Enhancement Using Labelled Random Finite Sets and a Neural Beamformer in Cocktail Party Scenario

Published on 2025-03-08 by Jayanta Datta, Ali Dehghan Firoozabadi, David Zabala-Blanco, Francisco R. Castillo-Soria @MDPI

Abstract: In this research, a multi-channel target speech enhancement scheme is proposed that is based on deep learning (DL) architecture and assisted by multi-source tracking using a labeled random finite set (RFS) framework. A neural network based on minimum variance distortionless response (MVDR) beamformer is considered as the beamformer of choice, where a residual dense convolutional graph-U-Net is applied in a generative adversarial network (GAN) setting to model the beamformer for target speech enh[...]


Our summary: Multi-channel target speech enhancement using labeled random finite sets and a neural beamformer in cocktail party scenario. Proposed scheme based on deep learning architecture and assisted by multi-source tracking using a labeled random finite set framework. Explores the use of a neural network based on minimum variance distortionless response beamformer for target speech enhancement under reverberant conditions involving multiple moving speech sources.

Speech Enhancement, Multi-Channel, Neural Beamformer, Deep Learning

Publication
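The MVDR beamformer at the core of this scheme computes weights w = R⁻¹d / (dᴴR⁻¹d) for a noise covariance matrix R and steering vector d. A toy NumPy sketch of the classic closed form (not the paper's neural, GAN-trained variant):

```python
import numpy as np

def mvdr_weights(R, d):
    """Classic MVDR weights: w = R^{-1} d / (d^H R^{-1} d).
    R: (M, M) noise covariance; d: (M,) steering vector."""
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d.conj() @ Rinv_d)

M = 4                                   # microphones
d = np.ones(M, dtype=complex)           # toy broadside steering vector
R = (np.eye(M) + 0.1 * np.ones((M, M))).astype(complex)  # toy covariance
w = mvdr_weights(R, d)
# Distortionless constraint: w^H d = 1 toward the target direction.
print(np.allclose(w.conj() @ d, 1.0))  # True
```

The constraint wᴴd = 1 keeps the target direction undistorted while the R⁻¹ term minimizes output power from noise and interferers.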

Improvements to language model watermarking

Patent published on 2025-03-06 in WO under Ref WO2025050084 by VERANCE CORP [US] (Winograd Joseph [us])

Abstract: A method of embedding a watermark into an AI LLM output, comprising inputting a text prompt to an LLM, the LLM generating a first token text output. Binary reduction is performed on the token text output, and watermarks are generated using a watermark-seeded random number generator. The watermarks are applied to successive bits of the character-set representations of the output by making a decision about accepting a completed token using a second random number generator that is distinct from the firs[...]


Our summary: Method of embedding watermarks into an AI language model output by performing binary reduction on the token text output and applying watermarks using a watermark-seeded random number generator, with the acceptance decision based on relative token probability. A new model probability density function (PDF) is generated for accepted tokens.

language model, watermarking, binary reduction, random number generator

Patent
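As a rough, generic illustration of seeded-RNG watermarking (not the patented method itself), a key-seeded generator can bias sampling toward a pseudorandom "green" subset of the vocabulary, with a second, distinct RNG doing the actual token draws:

```python
import random

def sample_with_watermark(vocab_probs, key, step):
    """Illustrative watermark-style sampling. A key-seeded RNG marks a
    'green' vocabulary subset; a second RNG draws tokens by probability
    and resamples until a green token is accepted (bounded retries).
    This is a generic sketch, not the patented algorithm."""
    marker = random.Random(hash((key, step)))   # watermark-seeded RNG
    green = {t for t in vocab_probs if marker.random() < 0.5}
    sampler = random.Random(step)               # distinct second RNG
    tokens, probs = zip(*vocab_probs.items())
    for _ in range(10):                         # bounded resampling
        tok = sampler.choices(tokens, weights=probs)[0]
        if tok in green:
            return tok
    return tok  # fall back to the last draw if no green token hit

# Hypothetical next-token distribution from an LLM.
probs = {"the": 0.4, "a": 0.3, "cat": 0.2, "dog": 0.1}
print(sample_with_watermark(probs, key=42, step=0) in probs)  # True
```

A detector holding the key can regenerate the green subsets per step and test whether the text's tokens land in them more often than chance.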

Agentic based processing for carbon emissions auditing

Patent published on 2025-03-06 in WO under Ref WO2025049513 by GEOQUEST SYSTEMS B V [NL] (Manikani Sunil [in], Sule Vishwanath [in], Hassan Abbi Moghaiyera [us], Milne Paul [gb])

Abstract: Carbon emission auditing includes obtaining a supplier transaction record of an enterprise corresponding to a supplier entity from a transaction repository. A research response corresponding to the supplier entity is obtained from a large language model (LLM). A validity of the supplier transaction record based on the research response is further obtained from the LLM as a validation response. Field values of the record fields of the supplier transaction record are further verified by the LLM, a[...]


Our summary: Agentic based processing for carbon emissions auditing involves obtaining supplier transaction records, validating them using a large language model, and generating audit failure explanations.

agentic, carbon emissions, auditing, language model

Patent




    Topics covered: Large Language Models, LLM, generative pre-trained transformer, pre-training, transformer architecture, gradient descent, GPT, tokenization, generative model, self-attention mechanism, masked language model, MLM, Knowledge Graphs, Artificial Intelligence, Speech Enhancement, Deep Learning, Transformer.
