Iqbal's DLQ

Easy RAG - Using Embeddings in LangChain4j to improve LLM responses

TL;DR

Embed a CSV file of article summaries with an embedding model, and use it as retrieval context to improve the LLM's generation of the article index.

Intro

This is a follow-up to JBang Meets Spring Boot & LangChain4j: A Powerhouse for Java Scripting and AI Pipelines.

Our initial use case was simply to prompt the LLM and chain transformations of a Markdown article. Here we will enhance the .topic file, which acts as a parent to the articles, with new articles and their summaries:

The idea is to generate a summary of each article, build a simple CSV file from those summaries, and use an embedding model to pass them as context to the LLM.
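For illustration, the resulting index could look like this (the filenames and summaries here are made up for the example, not taken from the actual repository):

```
filename;summary
jbang-spring-boot.md;JBang runs a Spring Boot and LangChain4j pipeline as a single-file Java script.
easy-rag-embeddings.md;Article summaries are embedded and retrieved as context to rebuild the topic index.
```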

This, alongside the original topic file, will allow us to continuously rebuild the index you see here:

Generated Java topic index in WriterSide
Structure of a .topic file

The Process

We will focus on the green steps; the earlier ones are plain LLM prompting, which we covered in the previous article.

Diagram of the CSV embedding process flow

The Code

Setting the model properties for autowiring:

OpenAI

```properties
#OPENAI Chat Model
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.model-name=gpt-4o-mini
langchain4j.open-ai.chat-model.log-requests=true
langchain4j.open-ai.chat-model.log-responses=true
langchain4j.open-ai.chat-model.timeout=1h

#OPENAI Embedding Model
langchain4j.open-ai.embedding-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.embedding-model.model-name=text-embedding-3-small
```

Gemini

```properties
langchain4j.google-ai-gemini.enabled=true
langchain4j.google-ai-gemini.chat-model.api-key=${GEMINI_API_KEY}
langchain4j.google-ai-gemini.chat-model.enabled=true
langchain4j.google-ai-gemini.chat-model.model-name=gemini-2.5-pro-exp-03-25
langchain4j.google-ai-gemini.embedding-model.api-key=${GEMINI_API_KEY}
langchain4j.google-ai-gemini.embedding-model.enabled=true
langchain4j.google-ai-gemini.embedding-model.model-name=text-embedding-004
```

The Runner

We will create a new runner for this use case:

```java
package com.iqbalaissaoui.runners;

import com.iqbalaissaoui.services.XMLTopicRefinerService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

import static com.iqbalaissaoui.assistants.WriterSideConstants.WRITERSIDE_TOPICS;

@Component
@Profile("single & xml")
public class XmlTopicGeneratorRunner implements CommandLineRunner {

    @Autowired
    private XMLTopicRefinerService xmlTopicRefinerService;

    @Override
    public void run(String... varargs) throws IOException {
        System.out.println("XmlTopicGeneratorRunner.run");

        // check that a topic file was passed as the first argument, or throw
        Optional.of(varargs)
                .filter(args -> args.length == 0)
                .ifPresent(args -> {
                    throw new IllegalArgumentException("Please provide a topic file as an argument");
                });

        Path topic = WRITERSIDE_TOPICS.resolve(varargs[0]);

        // check that the file exists, or throw
        Optional.of(Files.exists(topic))
                .filter(Boolean.FALSE::equals)
                .ifPresent(b -> {
                    throw new IllegalArgumentException("The file does not exist");
                });

        // check that the file is a .topic file
        Optional.of(topic)
                .filter(path -> !path.getFileName().toString().endsWith(".topic"))
                .ifPresent(path -> {
                    throw new IllegalArgumentException("The file is not a topic file");
                });

        xmlTopicRefinerService.refine(topic);
    }
}
```

The Service:

This is where we create the summaries of the articles, build the index, and use the embedding model to generate the context for the LLM:

generating the index file from the summaries

Here we run inference concurrently to summarize all articles and generate a simple CSV:

Summarizer AI Service

```java
package com.iqbalaissaoui.assistants;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.spring.AiService;

@AiService
public interface WriterSideMarkDownSumerizer {

    String SYSTEM_PROMPT = """
            Summarize the key points of the content in a concise and clear manner,
            keeping the length suitable for the summary attribute.
            Do not exceed 30 words and do not include new lines or special characters.
            """;

    @SystemMessage(SYSTEM_PROMPT)
    String chat(String userMessage);
}
```
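The prompt asks the model for one-line summaries without special characters, but LLMs do not always comply. A small defensive helper (hypothetical, not part of the article's code) can flatten whitespace and strip the `;` delimiter so a wayward summary cannot break the CSV:

```java
import java.util.Arrays;

public class SummarySanitizer {

    // Collapse newlines and whitespace, drop the CSV delimiter ';',
    // and cap the summary at maxWords words (mirroring the prompt's 30-word limit).
    static String sanitize(String raw, int maxWords) {
        String flat = raw.replaceAll("[\\r\\n;]+", " ")
                         .replaceAll("\\s+", " ")
                         .trim();
        String[] words = flat.split(" ");
        if (words.length <= maxWords) {
            return flat;
        }
        return String.join(" ", Arrays.copyOfRange(words, 0, maxWords));
    }

    public static void main(String[] args) {
        String messy = "First line;\nsecond line with  extra   spaces";
        System.out.println(sanitize(messy, 30)); // First line second line with extra spaces
    }
}
```

Applied as a wrapper around `writerSideMarkDownSumerizer.chat(...)`, this keeps the `filename;summary` rows parseable even when the model misbehaves.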

index generation and inference

```java
// create an index of the topics
// hinting to the LLM to use valid hrefs
Path index = Files.createTempFile("index", ".csv");
index.toFile().deleteOnExit();
// write the header and newline in a single call:
// Files.writeString truncates by default, so a second non-APPEND call would overwrite the header
Files.writeString(index, "filename;summary" + System.lineSeparator());

List<String> indexLines = mds.stream()
        .parallel()
        .peek(md -> System.out.println("Processing file: " + md))
        .map(p -> {
            try {
                return Map.entry(p.getFileName().toString(),
                        writerSideMarkDownSumerizer.chat(Files.readString(p)));
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        })
        .map(e -> e.getKey() + ";" + e.getValue())
        .toList();

indexLines.forEach(s -> {
    try {
        Files.writeString(index, s + System.lineSeparator(), StandardOpenOption.APPEND);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
});

System.out.println("index = " + Files.readString(index));

// load the documents (the index is the only document here)
List<Document> documents = new ArrayList<>();
documents.add(FileSystemDocumentLoader.loadDocument(index));
```
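One detail worth noting: even though the stream is `.parallel()`, collecting with `toList()` preserves encounter order, so each filename stays paired with its own summary. A minimal standalone demonstration:

```java
import java.util.List;
import java.util.stream.IntStream;

public class ParallelOrderDemo {

    // Even with .parallel(), toList() preserves encounter order,
    // so row i always carries file i's summary.
    static List<String> buildRows(int n) {
        return IntStream.range(0, n)
                .parallel()
                .mapToObj(i -> "file" + i + ".md;summary " + i)
                .toList();
    }

    public static void main(String[] args) {
        List<String> rows = buildRows(100);
        System.out.println(rows.get(0));   // file0.md;summary 0
        System.out.println(rows.get(99));  // file99.md;summary 99
    }
}
```

Only the `chat(...)` calls run concurrently; the ordering guarantee comes from the stream collector, not from any synchronization in our code.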

creating the embedding store and retriever

This is the snippet where we create the embedding store and retriever:

```java
// create the embedding store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

// ingest the documents with the autowired embedding model
EmbeddingStoreIngestor embeddingStoreIngestor = EmbeddingStoreIngestor.builder()
        .embeddingModel(embeddingModel)
        .embeddingStore(embeddingStore)
        .build();
embeddingStoreIngestor.ingest(documents);

// create the content retriever
EmbeddingStoreContentRetriever embeddingStoreContentRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingModel(embeddingModel)
        .embeddingStore(embeddingStore)
        .build();
```
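To demystify what happens at query time, here is a rough standalone sketch (toy two-dimensional vectors and a plain `Map`, not the real LangChain4j API): segments are stored alongside their embeddings, and retrieval is a nearest-neighbor search by cosine similarity.

```java
import java.util.Comparator;
import java.util.Map;

public class CosineRetrieverSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Return the stored segment whose embedding is closest to the query embedding.
    static String retrieve(Map<String, double[]> store, double[] query) {
        return store.entrySet().stream()
                .max(Comparator.comparingDouble(e -> cosine(e.getValue(), query)))
                .map(Map.Entry::getKey)
                .orElseThrow();
    }

    public static void main(String[] args) {
        Map<String, double[]> store = Map.of(
                "jbang.md;JBang scripting summary", new double[]{1.0, 0.1},
                "rag.md;Embedding RAG summary", new double[]{0.1, 1.0});
        double[] query = {0.0, 0.9}; // closer to the second segment
        System.out.println(retrieve(store, query)); // rag.md;Embedding RAG summary
    }
}
```

The real store additionally splits documents into `TextSegment`s and computes the embeddings for you via the configured `EmbeddingModel`; the similarity search itself is conceptually this simple.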

interface for the AI Service

This is a regular interface for the AI service. Unlike the annotation style used in the previous article, we will create the implementation programmatically, so that we can pass the embedding model and the retriever:

```java
package com.iqbalaissaoui.assistants;

public interface WriterSideXmlTopicGenerator {

    String SYSTEM_PROMPT = """
            You are an expert in JetBrains WriterSide, a technical documentation tool.
            Your task is to enhance XML.topic files by integrating references to Markdown (.md) articles provided as input.

            ### Instructions:

            **1. Input:**
            - An existing XML.topic file (if available).
            - A set of Markdown (.md) files that need to be referenced.

            **2. Task:**
            - Ensure all provided Markdown files are referenced in the XML.topic file,
              using every filename and every summary as content for the primary section.
            - Maintain proper XML.topic structure and formatting.
            - Improve the existing XML.topic file by integrating missing references while ensuring logical organization.

            **3. Output Requirements:**
            - Produce a valid, well-formed XML.topic file.
            - Ensure all Markdown files are correctly linked.
            - Avoid duplicate references.
            - Maintain consistent indentation and structure.

            **4. Constraints:**
            - Reference only the provided Markdown files—no external additions.
            - Group related topics logically based on filenames or inferred context when needed.
            """;

    String chat(String userMessage);
}
```
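The prompt demands a "valid, well-formed" file, but a prompt alone cannot guarantee that. A cheap guard (my addition, not part of the article's code) is to parse the LLM output with the JDK's built-in DOM parser before writing it back, so a malformed .topic never lands on disk:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class TopicWellFormedCheck {

    // Returns true if the string parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<topic><toc-element topic=\"rag.md\"/></topic>")); // true
        System.out.println(isWellFormed("<topic><unclosed></topic>"));                      // false
    }
}
```

Calling such a check before `Files.writeString(parent, output)` would let the service retry or fail loudly instead of corrupting the WriterSide tree.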

inference with the embedding model

As the final step, we create the LangChain4j assistant programmatically (in contrast to the annotation style of the previous article), so that we can pass the chat model, chat memory, and the content retriever:

```java
// create the WriterSide xml topic generator
WriterSideXmlTopicGenerator writerSideXmlTopicGenerator = AiServices.builder(WriterSideXmlTopicGenerator.class)
        .chatLanguageModel(chatLanguageModel)
        .systemMessageProvider(_ -> WriterSideXmlTopicGenerator.SYSTEM_PROMPT)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(embeddingStoreContentRetriever)
        .build();

String input = Files.readString(parent);
String output = writerSideXmlTopicGenerator.chat(input);
```

Full Service Code

Putting the pieces together.

```java
package com.iqbalaissaoui.services;

import com.iqbalaissaoui.assistants.WriterSideMarkDownSumerizer;
import com.iqbalaissaoui.assistants.WriterSideXmlTopicGenerator;
import com.iqbalaissaoui.utils.FilesService;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.apache.commons.lang3.StringUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Service;
import org.xml.sax.SAXException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

@Service
@Profile("xml")
public class XMLTopicRefinerService {

    @Autowired
    private WriterSideMarkDownSumerizer writerSideMarkDownSumerizer;

    @Autowired
    private ChatLanguageModel chatLanguageModel;

    @Autowired
    private EmbeddingModel embeddingModel;

    public void refine(Path parent) {
        try {
            List<Path> mds = FilesService.getMarkDownTopics(parent);

            // create an index of the topics
            // hinting to the LLM to use valid hrefs
            Path index = Files.createTempFile("index", ".csv");
            index.toFile().deleteOnExit();
            // single call: Files.writeString truncates by default,
            // so a second non-APPEND call would overwrite the header
            Files.writeString(index, "filename;summary" + System.lineSeparator());

            List<String> indexLines = mds.stream()
                    .parallel()
                    .peek(md -> System.out.println("Processing file: " + md))
                    .map(p -> {
                        try {
                            return Map.entry(p.getFileName().toString(),
                                    writerSideMarkDownSumerizer.chat(Files.readString(p)));
                        } catch (IOException e) {
                            throw new RuntimeException(e);
                        }
                    })
                    .map(e -> e.getKey() + ";" + e.getValue())
                    .toList();

            indexLines.forEach(s -> {
                try {
                    Files.writeString(index, s + System.lineSeparator(), StandardOpenOption.APPEND);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });

            System.out.println("index = " + Files.readString(index));

            // load the documents
            List<Document> documents = new ArrayList<>();
            documents.add(FileSystemDocumentLoader.loadDocument(index));

            // create the embedding store
            InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

            // ingest the documents with the autowired embedding model
            EmbeddingStoreIngestor embeddingStoreIngestor = EmbeddingStoreIngestor.builder()
                    .embeddingModel(embeddingModel)
                    .embeddingStore(embeddingStore)
                    .build();
            embeddingStoreIngestor.ingest(documents);

            // create the content retriever
            EmbeddingStoreContentRetriever embeddingStoreContentRetriever = EmbeddingStoreContentRetriever.builder()
                    .embeddingModel(embeddingModel)
                    .embeddingStore(embeddingStore)
                    .build();

            // create the WriterSide xml topic generator
            WriterSideXmlTopicGenerator writerSideXmlTopicGenerator = AiServices.builder(WriterSideXmlTopicGenerator.class)
                    .chatLanguageModel(chatLanguageModel)
                    .systemMessageProvider(_ -> WriterSideXmlTopicGenerator.SYSTEM_PROMPT)
                    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                    .contentRetriever(embeddingStoreContentRetriever)
                    .build();

            String input = Files.readString(parent);
            String output = writerSideXmlTopicGenerator.chat(input);

            Optional.of(StringUtils.difference(input, output))
                    .ifPresent(System.out::println);

            Files.writeString(parent, output);
        } catch (IOException | XPathExpressionException | ParserConfigurationException | SAXException e) {
            throw new RuntimeException(e);
        }
    }
}
```

running the code

```shell
SPRING_PROFILES_ACTIVE=single,xml,gemini jbang JBangSpringBootApp.java Java.topic
```

And we get the following diff:

Diff output showing topic file modifications
10 May 2025