
Small language models are transformer-based natural language processing systems with fewer than roughly 7 billion parameters. The threshold is less a formal definition than a practical constraint: the model must be deployable on consumer hardware, mobile devices, and embedded systems without cloud inference infrastructure.
The field emerged as a direct response to the computational and economic costs of frontier-scale models. Architectures well above that threshold demonstrate broad general capability, but their memory footprint, inference latency, and energy consumption make them structurally incompatible with on-device deployment, privacy-sensitive applications, and low-bandwidth or offline operating contexts.
The core research program aims to narrow the capability gap between compact and frontier models by combining several techniques: knowledge distillation, in which a small student model is trained on the output distribution of a larger teacher model; structured and unstructured pruning; aggressive weight quantization to INT4 and INT8 representations; and parameter-efficient fine-tuning methods such as LoRA and QLoRA, which adapt a compressed base model to domain-specific tasks at minimal additional compute cost.
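As an illustration of the distillation step, the following is a minimal sketch of a temperature-softened distillation loss in plain Python. It is not taken from any of the works listed on this page. The function names, the default temperature of 2.0, and the scaling by the squared temperature are illustrative assumptions that follow a common convention.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Assumption: the result is multiplied by temperature**2, a common
    convention that keeps gradient magnitudes comparable across temperatures.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. In practice it is usually combined with a standard cross-entropy term on the ground-truth labels.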
The publications and patents indexed below cover model compression techniques, quantization algorithms, distillation protocols, efficient transformer architectures, on-device inference optimization, and domain-specific fine-tuning pipelines.
This is our latest selection of worldwide publications and patents in English on Small Language Models (SLMs), drawn from numerous online scientific journals and classified and focused on the following terms: small language model, SLM, on-device language model, edge language model, compact transformer, sub-7B parameter model, language model compression, knowledge distillation NLP, structured pruning language model, unstructured pruning language model, weight quantization language model, INT4 quantization NLP, INT8 quantization NLP, parameter-efficient fine-tuning, LoRA fine-tuning, QLoRA fine-tuning, adapter tuning language model, on-device inference, edge inference NLP, speculative decoding, model distillation transformer, GGUF quantization format, and mixture-of-experts compact model.
Deformable high-strength aluminum alloy compositions and methods of making the same
Patent published on 2026-06-04 in US under Ref US20260152827 by PURDUE RES FOUNDATION [US] (Zhang Xinghang [us], Wang Haiyan [us], Stegman Benjamin Thomas [us], Shang Anyu [us])
Abstract: An alloy comprising 92 at % aluminum, 2 at % titanium, 2 at % iron, 2 at % cobalt, and 2 at % nickel. A method of making an alloy is disclosed. The method contains the steps of providing particles of desired composition, utilizing a selective laser melting (SLM) apparatus producing a first layer of the particles on a substrate and melting and solidifying a first group selected areas of the layer of particles, wherein the melting and the solidification results in an alloy of desired compo[...]
Our summary: The content describes a high-strength aluminum alloy with specific composition percentages. It outlines a method for creating the alloy using selective laser melting to achieve desired thickness and shape. The process involves layering particles, melting, and solidifying selected areas to form intermetallic structures.
aluminum alloy, selective laser melting, intermetallic lamellae, high-strength
Patent
Quantization-aware LoRA fine-tuning for LLM
Patent published on 2026-06-04 in US under Ref US20260154540 by MEDIATEK SINGAPORE PTE LTD [SG] (Lim Jia Yao Christopher [sg], Huang Ya-lin [tw], Li Huai-ting [tw], Wong Wai Mun [sg], Liang Jen-wei [tw], Lee Timothy Jun Jie [sg])
Abstract: In an aspect of the disclosure, a method of using a LoRA for inference with a FC layer of a LLM is provided. The method includes: dequantizing an INT input to an FP output; processing the FP output from the DQ and a first FP input from first weights of a down projection module of the LoRA, to output a first FP output; processing the first FP output from the first BMM and a second FP input from second weights of an up projection module of the LoRA, to output a second FP output; quantizing [...]
Our summary: The method describes using LoRA for inference in a fully connected layer of a large language model. It involves dequantizing inputs, processing them through down and up projection modules, and quantizing outputs. The final output is an INT inference result derived from the LoRA adjustments.
Quantization, LoRA, fine-tuning, LLM
Patent
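The data flow summarized in this entry (dequantize, down projection, up projection, quantize) can be sketched in plain Python as follows. This is a generic illustration, not the patented method: the abstract is truncated, so the way the result is combined with the base FC layer is not shown. Symmetric per-tensor scaling, the INT8 range, and all function names are assumptions.

```python
def dequantize(values, scale):
    """Map integer values back to floating point (assumed symmetric scaling)."""
    return [v * scale for v in values]


def quantize(values, scale, bits=8):
    """Round to the nearest integer step and clamp to the signed range."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def matvec(vector, matrix):
    """Row vector times matrix, with the matrix given as a list of rows."""
    cols = len(matrix[0])
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(cols)
    ]


def lora_int_path(x_int, in_scale, down, up, out_scale):
    """INT input -> FP -> down projection -> up projection -> INT output."""
    x_fp = dequantize(x_int, in_scale)
    hidden = matvec(x_fp, down)  # d_in -> rank r
    delta = matvec(hidden, up)   # rank r -> d_out
    return quantize(delta, out_scale)
```

For example, `lora_int_path([10, -20], 0.1, [[1.0], [1.0]], [[2.0, -4.0]], 0.5)` runs a rank-1 adapter over a two-element input. Keeping the low-rank matrices in floating point while the activations enter and leave as integers is the pattern the summary describes.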
Systems and methods for assisting operation and maintenance of marine machine equipment
Patent published on 2026-06-03 in EP under Ref EP4752805 by ALFA LAVAL CORP AB [SE] (Karlsson Jimmie [se], Boman Jesper [se])
Abstract: The present invention relates to a method of operating and maintaining a piece of marine machine equipment. The piece of marine machine equipment is connected to a local processor. The method comprising the steps of obtaining a set of training data specific to the piece of marine machine equipment and training a Small Language Model (SLM) with the set of training data specific to the piece of marine machine equipment. The method further comprising the step of executing the trained SLM on [...]
Our summary: The invention describes a method for operating and maintaining marine machine equipment using a local processor. It involves training a Small Language Model (SLM) with specific training data for the equipment. The trained SLM provides offline operational advice utilizing real-time data from the equipment.
marine machine equipment, operational advice, Small Language Model, real-time data
Patent
Parameter-free method for efficient and accurate LLM inference acceleration via speculative decoding
Patent published on 2026-05-07 in WO under Ref WO2026092843 by MARZOLLO MICHELE [DE] (Marzollo Michele [de], Mueller Lorenz [de], Zhuang Jiawei [de], Roemer Niklas [de], Cavigelli Lukas [de])
Abstract: In some examples, apparatus and methods are provided for selecting a draft token sequence for verification by using a large language model, LLM. Different sources of statistics on text data (prompt, generated output, large dataset of text data) can be utilized in order to choose candidates to use for speculative decoding via look-ups.[...]
Our summary: This parameter-free method accelerates LLM inference through speculative decoding. It selects draft token sequences for verification through statistical analysis of text data. The approach utilizes various sources of statistics to optimize candidate selection for decoding.
speculative decoding, LLM inference, token sequence selection, text data statistics
Patent
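Lookup-based drafting of the kind this entry mentions can be illustrated with a small sketch. It is a generic prompt-lookup scheme with greedy verification, not the patented method. The names `draft_from_prompt`, `verify`, and `target_next` are hypothetical, and `target_next` stands in for a greedy call to the large model.

```python
def draft_from_prompt(tokens, ngram=2, max_draft=4):
    """Propose draft tokens by looking up the last `ngram` tokens earlier in
    the sequence and copying what followed the most recent match."""
    if len(tokens) < ngram:
        return []
    key = tokens[-ngram:]
    # Scan backwards, skipping the trailing n-gram itself.
    for start in range(len(tokens) - ngram - 1, -1, -1):
        if tokens[start:start + ngram] == key:
            return tokens[start + ngram:start + ngram + max_draft]
    return []


def verify(tokens, draft, target_next):
    """Accept draft tokens while they match the target model's greedy choice.

    Assumption: `target_next(context)` returns the target model's next token.
    A real system would score all draft positions in one batched forward
    pass; the sequential calls here are only for clarity.
    """
    accepted = []
    for tok in draft:
        expected = target_next(tokens + accepted)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
    # Every draft token matched, so the target contributes one extra token.
    accepted.append(target_next(tokens + accepted))
    return accepted
```

Because every accepted token equals the target model's own greedy choice, the output matches plain greedy decoding. The speedup comes from confirming several tokens per target-model call when the lookup guesses well.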
Automated synthesis of planar linkage mechanisms with diverse joint types via spring-connected link models and contrastive graph learning
Published on 2026-03-28 by @OXFORD
Abstract: The automated synthesis of planar linkage mechanisms has long been a challenge in mechanism design, requiring both geometric feasibility and motion accuracy. Recent advances in data-driven and neural network–based methods have shown promise in automating linkage synthesis, improving efficiency and scalability compared to traditional analytical or optimization-based techniques. Nevertheless, existing data-driven approaches remain limited in handling diverse joint configurations and ofte[...]
Our summary: This study presents a framework for automating the synthesis of planar linkage mechanisms using deep learning and physics-based modeling. It employs a spring-connected link model for diverse joint configurations and utilizes contrastive graph learning for efficient linkage retrieval. The method demonstrates improved accuracy and optimization stability compared to traditional approaches.
mechanism synthesis, deep learning, contrastive graph learning, optimization stability
Publication
Enhancing Whisper Fine-Tuning with Discrete Wavelet Transform-Based LoRA Initialization
Published on 2026-01-29 by Liang Lan, Molin Fang, Yuxuan Chen, Daliang Wang, Wenyong Wang @MDPI
Abstract: In low-resource automatic speech recognition (ASR) scenarios, parameter-efficient fine-tuning (PEFT) has become a crucial approach for adapting large pre-trained speech models. Although low-rank adaptation (LoRA) offers clear advantages in efficiency, stability, and deployment friendliness, its performance remains constrained because random initialization fails to capture the time–frequency structural characteristics of speech signals. To address this limitation, this work proposes[...]
Our summary: This work introduces a structured initialization mechanism combining LoRA with discrete wavelet transform for fine-tuning in low-resource ASR. The proposed DWTLoRA method enhances convergence speed, stability, and accuracy by aligning with speech signal characteristics. Experimental results show DWTLoRA outperforms standard LoRA and other PEFT methods in character error rate and training efficiency.
Fine-Tuning, Discrete Wavelet Transform, Low-Rank Adaptation, Automatic Speech Recognition
Publication
Influence and Optimization of Process Parameters on Surface Roughness of Selective Laser Melting of 316L Stainless Steel
Published on 2026-01-20 by Pin Dong, Kamonpong Jamkamon, Suppawat Chuvaree @MDPI
Abstract: To achieve better surface quality in selective laser melting (SLM), this study used 316L stainless steel powder and conducted a systematic design experiment to investigate the influence mechanism of process parameters on the surface roughness of the top and vertical surfaces. Response surface methodology (RSM) was then used for parameter optimization. The results showed that scanning speed has the greatest impact on surface roughness, followed by laser power, while scanning spacing has the least[...]
Our summary: This study investigates the impact of process parameters on the surface roughness of 316L stainless steel in selective laser melting. Scanning speed significantly affects surface quality, with optimal conditions identified for minimal roughness. The findings validate the effectiveness of the response surface methodology used for parameter optimization.
Selective Laser Melting, Surface Roughness, Process Parameters, Response Surface Methodology
Publication
A Lightweight LLM-Based Semantic–Spatial Inference Framework for Fine-Grained Urban POI Analysis
Published on 2026-01-16 by Zhuo Huang, Yixing Guo, Shuo Huang, Miaoxi Zhao @MDPI
Abstract: Unstructured POI name texts are widely used in fine-grained urban analysis, yet missing labels and semantic ambiguity often limit their value for spatial inference. This study proposes a large language model-based semantic–spatial inference framework (LLM-SSIF), a lightweight semantic–spatial pipeline that translates POI texts into interpretable, fine-grained spatial evidence through an end-to-end workflow that couples scalable label expansion with scale-controlled sp[...]
Our summary: This study introduces LLM-SSIF, a lightweight framework for translating unstructured POI texts into spatial evidence. It employs LoRA-based fine-tuning for efficient adaptation and enhances label coverage. The model demonstrates strong performance in urban analysis, revealing cultural differences between cities.
LLM, semantic inference, spatial analysis, fine-grained POI
Publication











