Iqbal's DLQ

Easy RAG - Using Embeddings in LangChain4j to improve LLM responses

TL;DR

Embed a CSV file of article summaries with an embedding model, and use it as retrieval context to improve the LLM's generation of the article index.

Intro

This is a follow-up to JBang Meets Spring Boot & LangChain4j: A Powerhouse for Java Scripting and AI Pipelines.

Our initial use case was simply to prompt the LLM and chain transformations of a Markdown article. Here we will enhance the .topic file, which acts as a parent to the articles, with new articles and their summaries:

The idea is to generate a summary of each article, build a simple CSV file from those summaries, and use an embedding model to pass them as context to the LLM.
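For illustration, the resulting index could look like this (the filenames and summaries here are made up for the example, not taken from the actual repository):

```
filename;summary
jbang-spring-boot.md;JBang runs a Spring Boot and LangChain4j pipeline as a single-file Java script.
easy-rag-embeddings.md;Article summaries are embedded and retrieved as context to rebuild the topic index.
```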

This, alongside the original topic file, will allow us to continuously rebuild the index you see here:

Generated Java topic index in WriterSide
Structure of a .topic file

The Process

We will focus on the green steps; the earlier ones are plain LLM prompting, which we covered in the previous article.

Diagram of the CSV embedding process flow

The Code

Setting the model properties for autowiring:

OpenAI

```properties
#OPENAI Chat Model
langchain4j.open-ai.chat-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.chat-model.model-name=gpt-4o-mini
langchain4j.open-ai.chat-model.log-requests=true
langchain4j.open-ai.chat-model.log-responses=true
langchain4j.open-ai.chat-model.timeout=1h

#OPENAI Embedding Model
langchain4j.open-ai.embedding-model.api-key=${OPENAI_API_KEY}
langchain4j.open-ai.embedding-model.model-name=text-embedding-3-small
```

Gemini

```properties
langchain4j.google-ai-gemini.enabled=true
langchain4j.google-ai-gemini.chat-model.api-key=${GEMINI_API_KEY}
langchain4j.google-ai-gemini.chat-model.enabled=true
langchain4j.google-ai-gemini.chat-model.model-name=gemini-2.5-pro-exp-03-25
langchain4j.google-ai-gemini.embedding-model.api-key=${GEMINI_API_KEY}
langchain4j.google-ai-gemini.embedding-model.enabled=true
langchain4j.google-ai-gemini.embedding-model.model-name=text-embedding-004
```

The Runner

We will create a new runner for this use case:

```java
package com.iqbalaissaoui.runners;

import com.iqbalaissaoui.services.XMLTopicRefinerService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

import static com.iqbalaissaoui.assistants.WriterSideConstants.WRITERSIDE_TOPICS;

@Component
@Profile("single & xml")
public class XmlTopicGeneratorRunner implements CommandLineRunner {

    @Autowired
    private XMLTopicRefinerService xmlTopicRefinerService;

    @Override
    public void run(String... varargs) throws IOException {
        System.out.println("XmlTopicGeneratorRunner.run");

        // check that a topic file was passed as the first argument, or throw
        Optional.of(varargs)
                .filter(args -> args.length == 0)
                .ifPresent(args -> {
                    throw new IllegalArgumentException("Please provide a topic file as an argument");
                });

        Path topic = WRITERSIDE_TOPICS.resolve(varargs[0]);

        // check that the file exists, or throw
        Optional.of(Files.exists(topic))
                .filter(Boolean.FALSE::equals)
                .ifPresent(b -> {
                    throw new IllegalArgumentException("The file does not exist");
                });

        // check that the file is a .topic file
        Optional.of(topic)
                .filter(path -> !path.getFileName().toString().endsWith(".topic"))
                .ifPresent(path -> {
                    throw new IllegalArgumentException("The file is not a topic file");
                });

        xmlTopicRefinerService.refine(topic);
    }
}
```

The Service:

This is where we create the summaries of the articles, build the index, and use the embedding model to generate the context for the LLM:

generating the index file from the summaries

Here we run inference concurrently to summarize all articles and generate a simple CSV:

Summarizer AI Service

```java
package com.iqbalaissaoui.assistants;

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.spring.AiService;

@AiService
public interface WriterSideMarkDownSumerizer {

    String SYSTEM_PROMPT = """
            Summarize the key points of the content in a concise and clear manner,
            keeping the length suitable for the summary attribute.
            Do not exceed 30 words and do not include new lines or special characters.
            """;

    @SystemMessage(SYSTEM_PROMPT)
    String chat(String userMessage);
}
```
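The prompt asks the model for one-line summaries without special characters, but LLMs do not always comply. A small defensive helper (hypothetical, not part of the article's code) can flatten whitespace and strip the `;` delimiter so a wayward summary cannot break the CSV:

```java
import java.util.Arrays;

public class SummarySanitizer {

    // Collapse newlines and whitespace, drop the CSV delimiter ';',
    // and cap the summary at maxWords words (mirroring the prompt's 30-word limit).
    static String sanitize(String raw, int maxWords) {
        String flat = raw.replaceAll("[\\r\\n;]+", " ")
                         .replaceAll("\\s+", " ")
                         .trim();
        String[] words = flat.split(" ");
        if (words.length <= maxWords) {
            return flat;
        }
        return String.join(" ", Arrays.copyOfRange(words, 0, maxWords));
    }

    public static void main(String[] args) {
        String messy = "First line;\nsecond line with  extra   spaces";
        System.out.println(sanitize(messy, 30)); // First line second line with extra spaces
    }
}
```

Applied as a wrapper around `writerSideMarkDownSumerizer.chat(...)`, this keeps the `filename;summary` rows parseable even when the model misbehaves.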

index generation and inference

```java
// create an index of the topics
// hinting to the LLM to use valid hrefs
Path index = Files.createTempFile("index", ".csv");
index.toFile().deleteOnExit();
// write the header and newline in a single call:
// Files.writeString truncates by default, so a second non-APPEND call would overwrite the header
Files.writeString(index, "filename;summary" + System.lineSeparator());

List<String> indexLines = mds.stream()
        .parallel()
        .peek(md -> System.out.println("Processing file: " + md))
        .map(p -> {
            try {
                return Map.entry(p.getFileName().toString(),
                        writerSideMarkDownSumerizer.chat(Files.readString(p)));
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        })
        .map(e -> e.getKey() + ";" + e.getValue())
        .toList();

indexLines.forEach(s -> {
    try {
        Files.writeString(index, s + System.lineSeparator(), StandardOpenOption.APPEND);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
});

System.out.println("index = " + Files.readString(index));

// load the documents (the index is the only document here)
List<Document> documents = new ArrayList<>();
documents.add(FileSystemDocumentLoader.loadDocument(index));
```
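One detail worth noting: even though the stream is `.parallel()`, collecting with `toList()` preserves encounter order, so each filename stays paired with its own summary. A minimal standalone demonstration:

```java
import java.util.List;
import java.util.stream.IntStream;

public class ParallelOrderDemo {

    // Even with .parallel(), toList() preserves encounter order,
    // so row i always carries file i's summary.
    static List<String> buildRows(int n) {
        return IntStream.range(0, n)
                .parallel()
                .mapToObj(i -> "file" + i + ".md;summary " + i)
                .toList();
    }

    public static void main(String[] args) {
        List<String> rows = buildRows(100);
        System.out.println(rows.get(0));   // file0.md;summary 0
        System.out.println(rows.get(99));  // file99.md;summary 99
    }
}
```

Only the `chat(...)` calls run concurrently; the ordering guarantee comes from the stream collector, not from any synchronization in our code.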

creating the embedding store and retriever

This is the snippet where we create the embedding store and retriever:

```java
// create the embedding store
InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

// ingest the documents with the autowired embedding model
EmbeddingStoreIngestor embeddingStoreIngestor = EmbeddingStoreIngestor.builder()
        .embeddingModel(embeddingModel)
        .embeddingStore(embeddingStore)
        .build();
embeddingStoreIngestor.ingest(documents);

// create the content retriever
EmbeddingStoreContentRetriever embeddingStoreContentRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingModel(embeddingModel)
        .embeddingStore(embeddingStore)
        .build();
```
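To demystify what happens at query time, here is a rough standalone sketch (toy two-dimensional vectors and a plain `Map`, not the real LangChain4j API): segments are stored alongside their embeddings, and retrieval is a nearest-neighbor search by cosine similarity.

```java
import java.util.Comparator;
import java.util.Map;

public class CosineRetrieverSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Return the stored segment whose embedding is closest to the query embedding.
    static String retrieve(Map<String, double[]> store, double[] query) {
        return store.entrySet().stream()
                .max(Comparator.comparingDouble(e -> cosine(e.getValue(), query)))
                .map(Map.Entry::getKey)
                .orElseThrow();
    }

    public static void main(String[] args) {
        Map<String, double[]> store = Map.of(
                "jbang.md;JBang scripting summary", new double[]{1.0, 0.1},
                "rag.md;Embedding RAG summary", new double[]{0.1, 1.0});
        double[] query = {0.0, 0.9}; // closer to the second segment
        System.out.println(retrieve(store, query)); // rag.md;Embedding RAG summary
    }
}
```

The real store additionally splits documents into `TextSegment`s and computes the embeddings for you via the configured `EmbeddingModel`; the similarity search itself is conceptually this simple.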

interface for the AI Service

This is a regular interface for the AI service. Unlike the annotation style used in the previous article, we will create the implementation programmatically, so that we can pass the embedding model and the retriever:

```java
package com.iqbalaissaoui.assistants;

public interface WriterSideXmlTopicGenerator {

    String SYSTEM_PROMPT = """
            You are an expert in JetBrains WriterSide, a technical documentation tool.
            Your task is to enhance XML.topic files by integrating references to Markdown (.md) articles provided as input.

            ### Instructions:

            **1. Input:**
            - An existing XML.topic file (if available).
            - A set of Markdown (.md) files that need to be referenced.

            **2. Task:**
            - Ensure all provided Markdown files are referenced in the XML.topic file,
              using every filename and every summary as content for the primary section.
            - Maintain proper XML.topic structure and formatting.
            - Improve the existing XML.topic file by integrating missing references while ensuring logical organization.

            **3. Output Requirements:**
            - Produce a valid, well-formed XML.topic file.
            - Ensure all Markdown files are correctly linked.
            - Avoid duplicate references.
            - Maintain consistent indentation and structure.

            **4. Constraints:**
            - Reference only the provided Markdown files—no external additions.
            - Group related topics logically based on filenames or inferred context when needed.
            """;

    String chat(String userMessage);
}
```
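The prompt demands a "valid, well-formed" file, but a prompt alone cannot guarantee that. A cheap guard (my addition, not part of the article's code) is to parse the LLM output with the JDK's built-in DOM parser before writing it back, so a malformed .topic never lands on disk:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class TopicWellFormedCheck {

    // Returns true if the string parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<topic><toc-element topic=\"rag.md\"/></topic>")); // true
        System.out.println(isWellFormed("<topic><unclosed></topic>"));                      // false
    }
}
```

Calling such a check before `Files.writeString(parent, output)` would let the service retry or fail loudly instead of corrupting the WriterSide tree.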

inference with the embedding model

As the final step, we create the LangChain4j assistant programmatically (in contrast to the annotation style of the previous article), so that we can pass the chat model, chat memory, and the content retriever:

```java
// create the WriterSide xml topic generator
WriterSideXmlTopicGenerator writerSideXmlTopicGenerator = AiServices.builder(WriterSideXmlTopicGenerator.class)
        .chatLanguageModel(chatLanguageModel)
        .systemMessageProvider(_ -> WriterSideXmlTopicGenerator.SYSTEM_PROMPT)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(embeddingStoreContentRetriever)
        .build();

String input = Files.readString(parent);
String output = writerSideXmlTopicGenerator.chat(input);
```

Full Service Code

Putting the pieces together.

```java
package com.iqbalaissaoui.services;

import com.iqbalaissaoui.assistants.WriterSideMarkDownSumerizer;
import com.iqbalaissaoui.assistants.WriterSideXmlTopicGenerator;
import com.iqbalaissaoui.utils.FilesService;
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;
import org.apache.commons.lang3.StringUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Service;
import org.xml.sax.SAXException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

@Service
@Profile("xml")
public class XMLTopicRefinerService {

    @Autowired
    private WriterSideMarkDownSumerizer writerSideMarkDownSumerizer;

    @Autowired
    private ChatLanguageModel chatLanguageModel;

    @Autowired
    private EmbeddingModel embeddingModel;

    public void refine(Path parent) {
        try {
            List<Path> mds = FilesService.getMarkDownTopics(parent);

            // create an index of the topics
            // hinting to the LLM to use valid hrefs
            Path index = Files.createTempFile("index", ".csv");
            index.toFile().deleteOnExit();
            // single call: Files.writeString truncates by default,
            // so a second non-APPEND call would overwrite the header
            Files.writeString(index, "filename;summary" + System.lineSeparator());

            List<String> indexLines = mds.stream()
                    .parallel()
                    .peek(md -> System.out.println("Processing file: " + md))
                    .map(p -> {
                        try {
                            return Map.entry(p.getFileName().toString(),
                                    writerSideMarkDownSumerizer.chat(Files.readString(p)));
                        } catch (IOException e) {
                            throw new RuntimeException(e);
                        }
                    })
                    .map(e -> e.getKey() + ";" + e.getValue())
                    .toList();

            indexLines.forEach(s -> {
                try {
                    Files.writeString(index, s + System.lineSeparator(), StandardOpenOption.APPEND);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });

            System.out.println("index = " + Files.readString(index));

            // load the documents
            List<Document> documents = new ArrayList<>();
            documents.add(FileSystemDocumentLoader.loadDocument(index));

            // create the embedding store
            InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

            // ingest the documents with the autowired embedding model
            EmbeddingStoreIngestor embeddingStoreIngestor = EmbeddingStoreIngestor.builder()
                    .embeddingModel(embeddingModel)
                    .embeddingStore(embeddingStore)
                    .build();
            embeddingStoreIngestor.ingest(documents);

            // create the content retriever
            EmbeddingStoreContentRetriever embeddingStoreContentRetriever = EmbeddingStoreContentRetriever.builder()
                    .embeddingModel(embeddingModel)
                    .embeddingStore(embeddingStore)
                    .build();

            // create the WriterSide xml topic generator
            WriterSideXmlTopicGenerator writerSideXmlTopicGenerator = AiServices.builder(WriterSideXmlTopicGenerator.class)
                    .chatLanguageModel(chatLanguageModel)
                    .systemMessageProvider(_ -> WriterSideXmlTopicGenerator.SYSTEM_PROMPT)
                    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                    .contentRetriever(embeddingStoreContentRetriever)
                    .build();

            String input = Files.readString(parent);
            String output = writerSideXmlTopicGenerator.chat(input);

            Optional.of(StringUtils.difference(input, output))
                    .ifPresent(System.out::println);

            Files.writeString(parent, output);
        } catch (IOException | XPathExpressionException | ParserConfigurationException | SAXException e) {
            throw new RuntimeException(e);
        }
    }
}
```

running the code

```shell
SPRING_PROFILES_ACTIVE=single,xml,gemini jbang JBangSpringBootApp.java Java.topic
```

And we get the following diff:

Diff output showing topic file modifications
10 May 2025