Trying LangChain's JSONToolkit on package-lock.json and tsconfig.json

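I have been exploring how to use LangChain's Toolkits and Agents to read JSON data and generate answers. JSONToolkit did not give me the results I had hoped for, but I am recording what happened in this article. Specifically, I tried to investigate a project's dependencies by feeding it package-lock.json, and ran into a token limit error. The tsconfig.json attempt also fell short: the agent answered with only `true`. JSONToolkit seems best suited to extracting information that is written in the JSON and building an answer from it. For requests that ask for commentary on the JSON's contents, JSONLoader combined with a RAG-style chain is probably a better fit.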

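This winter break I am looking into whether LangChain's Toolkits, Agents, and so on offer anything I can put to use in my job or my personal projects. This time I tried JSONToolkit, a toolkit that, by the look of it, can read JSON data and generate answers. In short, it did not give me quite the results I had hoped for, but I am writing this up so I can look back later at what happened.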

The use case I had in mind

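I wondered whether feeding it package-lock.json could help investigate a project's dependencies, in particular version conflicts and circular references. I know tools for this already exist, but being able to ask in natural language would be convenient in its own way, so I gave it a try.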
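
For reference, the "same library, different versions" question can also be answered directly from the lockfile, which gives a baseline to compare an agent's answer against. The sketch below is not from the original experiment. It assumes the lockfileVersion 2 or 3 layout, where a `packages` map is keyed by install path such as `node_modules/foo`, and the helper name `findMultiVersionPackages` is hypothetical.

```typescript
// Assumed shape (lockfileVersion 2/3): install path -> package metadata.
type LockPackages = { [installPath: string]: { version?: string } };
type VersionsByName = { [name: string]: string[] };

// Hypothetical helper: returns only the packages installed in more than one version.
function findMultiVersionPackages(packages: LockPackages): VersionsByName {
  const marker = "node_modules/";
  const seen: VersionsByName = {};
  for (const installPath of Object.keys(packages)) {
    const version = packages[installPath].version;
    const at = installPath.lastIndexOf(marker);
    // Skip the root entry ("") and entries without a version.
    if (at === -1 || version === undefined) continue;
    // The package name is whatever follows the last "node_modules/".
    const name = installPath.substring(at + marker.length);
    const known = Object.prototype.hasOwnProperty.call(seen, name) ? seen[name] : [];
    if (known.indexOf(version) === -1) known.push(version);
    seen[name] = known;
  }
  const result: VersionsByName = {};
  for (const name of Object.keys(seen)) {
    if (seen[name].length > 1) result[name] = seen[name].sort();
  }
  return result;
}

// Made-up sample data, for illustration only.
const sampleLock: LockPackages = {
  "": { version: "1.0.0" },
  "node_modules/foo": { version: "1.0.0" },
  "node_modules/bar": { version: "2.0.0" },
  "node_modules/bar/node_modules/foo": { version: "2.0.0" },
};
console.log(findMultiVersionPackages(sampleLock)); // { foo: [ '1.0.0', '2.0.0' ] }
```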

Feeding it `package-lock.json`

I loaded the file right away. The JSON file itself is read with fs.readFileSync.

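    import * as fs from "fs";
    import * as path from "path";
    import { OpenAI } from "langchain/llms/openai";
    import { JsonSpec, JsonObject } from "langchain/tools";
    import { JsonToolkit, createJsonAgent } from "langchain/agents";

    // The rest runs inside an async function.
    // `env` is the project's own config object (not shown here).
    let data: JsonObject;
    try {
      const json = fs.readFileSync(path.join(__dirname, '../package-lock.json'), "utf8");
      data = JSON.parse(json) as JsonObject;
      if (!data) {
        throw new Error("Failed to load package-lock.json");
      }
    } catch (e) {
      console.error(e);
      return;
    }

    // Wrap the parsed JSON in a spec and build an agent that can explore it.
    const toolkit = new JsonToolkit(new JsonSpec(data));
    const model = new OpenAI({
      temperature: 0,
      openAIApiKey: env.openai.apiKey
    });
    const executor = createJsonAgent(model, toolkit);

    // The prompt was written in Japanese. In English: "If the same library is
    // specified with different versions, tell me its name and versions."
    const input = `同じライブラリで、異なるバージョンが指定されている物があれば、その名前とバージョンを教えてください。`;

    console.log(`Executing with input "${input}"...`);

    const result = await executor.invoke({ input });

    console.log(`Got output ${result.output}`);

    console.log(
      `Got intermediate steps ${JSON.stringify(
        result.intermediateSteps,
        null,
        2
      )}`
    );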

I wanted to ask about dependencies, but the request failed with a token limit error.

BadRequestError: 400 This model's maximum context length is 4097 tokens, however you requested 4695 tokens (4439 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

To use this for real, the data would probably have to be split up with a splitter first.
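
One way to picture the splitting step, as a minimal sketch rather than a LangChain API: the hypothetical helper `splitByTopLevelKeys` below groups the keys of one JSON object into chunks whose serialized size stays under a character budget. Characters are only a rough proxy for tokens, and the budget value is an assumption. For package-lock.json it would likely be applied to the `packages` map rather than to the whole file (assuming lockfileVersion 2 or 3).

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonRecord = { [key: string]: Json };

// Hypothetical helper (not part of LangChain): split one object into several
// smaller objects, each serializing to roughly `maxChars` characters or fewer.
function splitByTopLevelKeys(data: JsonRecord, maxChars: number): JsonRecord[] {
  const chunks: JsonRecord[] = [];
  let current: JsonRecord = {};
  let currentSize = 2; // the surrounding "{}"

  for (const key of Object.keys(data)) {
    // "key":value plus one separator; a slight overestimate of the real size.
    const entrySize =
      JSON.stringify(key).length + 1 + JSON.stringify(data[key]).length + 1;
    const hasEntries = Object.keys(current).length > 0;
    if (hasEntries && currentSize + entrySize > maxChars) {
      // The current chunk is full: close it and start a new one.
      chunks.push(current);
      current = {};
      currentSize = 2;
    }
    // An entry bigger than the budget still ends up alone in its own chunk.
    current[key] = data[key];
    currentSize += entrySize;
  }
  if (Object.keys(current).length > 0) chunks.push(current);
  return chunks;
}

// Made-up data: with a 26-character budget, "a" fills one chunk by itself.
const parts = splitByTopLevelKeys({ a: "xxxxxxxxxx", b: "yyyyyyyyyy", c: 1 }, 26);
console.log(parts.length); // 2
```

Each chunk could then be wrapped in its own JsonSpec, at the cost of the agent no longer seeing the whole file at once, which matters for questions that compare entries across chunks.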

Trying it with `tsconfig.json`

Let's try something more lightweight. This time I used tsconfig.json as the source, and asked a question whose answer maps fairly directly onto the file.

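    // The prompt was written in Japanese. In English it reads roughly:
    //   If the following error occurs, how should I fix 'tsconfig.json'?
    //   Module '"path"' can only be default-imported using the 'esModuleInterop' flag ts(1259)
    //   path.d.ts(178, 5): This module is declared with 'export =', and can only be
    //   used with a default import when using the 'esModuleInterop' flag.
    const input = `
    次のエラーが発生した場合、'tsconfig.json'をどのように修正すれば良いですか？
    ===
    モジュール '"path"' は、'esModuleInterop' フラグを使用して既定でのみインポートできますts(1259)
    path.d.ts(178, 5): このモジュールは、'export =' を使用して宣言されており、'esModuleInterop' フラグを使用する場合は既定のインポートでのみ使用できます。
    `;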

Here is the result. The agent answered with just `true`, which left me with the impression that I was using the toolkit in a way it was not really designed for.

Got output true
Got intermediate steps [
  {
    "action": {
      "tool": "json_list_keys",
      "toolInput": "",
      "log": "Action: json_list_keys\nAction Input: "
    },
    "observation": "compilerOptions"
  },
  {
    "action": {
      "tool": "json_list_keys",
      "toolInput": "/compilerOptions",
      "log": " I should look at the keys that exist to see what I can query. I should use the 'json_list_keys' tool with the path to the compilerOptions.\nAction: json_list_keys\nAction Input: /compilerOptions"
    },
    "observation": "strict, jsx, jsxImportSource, esModuleInterop"
  },
  {
    "action": {
      "tool": "json_list_keys",
      "toolInput": "/compilerOptions/esModuleInterop",
      "log": " I should look at the keys that exist to see what I can query. I should use the 'json_list_keys' tool with the path to the compilerOptions.\nAction: json_list_keys\nAction Input: /compilerOptions/esModuleInterop"
    },
    "observation": "Error: Value at /compilerOptions/esModuleInterop is not a dictionary, get the value directly instead."
  },
  {
    "action": {
      "tool": "json_get_value",
      "toolInput": "/compilerOptions/esModuleInterop",
      "log": " I should use the 'json_get_value' tool to get the value at the path.\nAction: json_get_value\nAction Input: /compilerOptions/esModuleInterop"
    },
    "observation": "true"
  }
]
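
The trace shows why the answer was `true`: the agent walked down the key path and reported the value it found there, without reasoning about the error message. To make that concrete, here is a simplified re-creation of what the two tools appear to do, based only on the observations in the trace. It is not LangChain's implementation. The function names are hypothetical, and every sample value except `esModuleInterop: true` is an assumption.

```typescript
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

// Follow a path such as "/compilerOptions/esModuleInterop" from the root.
function resolvePath(root: JsonValue, path: string): JsonValue | undefined {
  const segments = path.split("/").filter((s) => s.length > 0);
  let node: JsonValue | undefined = root;
  for (const segment of segments) {
    if (node === null || typeof node !== "object") return undefined;
    node = Array.isArray(node) ? node[Number(segment)] : node[segment];
    if (node === undefined) return undefined;
  }
  return node;
}

// Mimics the observed json_list_keys behavior: keys of an object, or an error.
function listKeys(root: JsonValue, path: string): string {
  const node = resolvePath(root, path);
  if (node === undefined || node === null || typeof node !== "object" || Array.isArray(node)) {
    return `Error: Value at ${path} is not a dictionary, get the value directly instead.`;
  }
  return Object.keys(node).join(", ");
}

// Mimics the observed json_get_value behavior: the value at a path, as text.
function getValue(root: JsonValue, path: string): string {
  const node = resolvePath(root, path);
  if (node === undefined) return `Error: no value at ${path}`; // hypothetical message
  return typeof node === "string" ? node : JSON.stringify(node);
}

// Keys taken from the trace; values other than esModuleInterop are placeholders.
const sampleConfig: JsonValue = {
  compilerOptions: { strict: true, jsx: "react-jsx", jsxImportSource: "react", esModuleInterop: true },
};
console.log(listKeys(sampleConfig, "/compilerOptions")); // strict, jsx, jsxImportSource, esModuleInterop
console.log(getValue(sampleConfig, "/compilerOptions/esModuleInterop")); // true
```

Both tools only look things up, so a question that needs interpretation of the file ends with whatever value the last lookup returned.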

After trying it

JSONToolkit may be best suited to tasks of the kind shown in the official sample, where the answer is built by pulling out something that is written in the JSON.

import * as fs from "fs";
import * as yaml from "js-yaml";
import { OpenAI } from "langchain/llms/openai";
import { JsonSpec, JsonObject } from "langchain/tools";
import { JsonToolkit, createJsonAgent } from "langchain/agents";

export const run = async () => {
  let data: JsonObject;
  try {
    const yamlFile = fs.readFileSync("openai_openapi.yaml", "utf8");
    data = yaml.load(yamlFile) as JsonObject;
    if (!data) {
      throw new Error("Failed to load OpenAPI spec");
    }
  } catch (e) {
    console.error(e);
    return;
  }

  const toolkit = new JsonToolkit(new JsonSpec(data));
  const model = new OpenAI({ temperature: 0 });
  const executor = createJsonAgent(model, toolkit);

  const input = `What are the required parameters in the request body to the /completions endpoint?`;

  console.log(`Executing with input "${input}"...`);

  const result = await executor.invoke({ input });

  console.log(`Got output ${result.output}`);

  console.log(
    `Got intermediate steps ${JSON.stringify(
      result.intermediateSteps,
      null,
      2
    )}`
  );
};

For this kind of request, one that asks for commentary on the contents of a JSON file, JSONLoader combined with a RAG-style chain may be the better choice.
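
To illustrate the general idea without any LangChain dependency, here is a rough sketch: flatten the JSON into text lines, pick the lines that look relevant to the question, and place them in the prompt as context. This is not JSONLoader or a real RAG chain. The keyword match is a naive stand-in for embedding-based retrieval, and all function names and the prompt wording are hypothetical.

```typescript
type JsonNode = string | number | boolean | null | JsonNode[] | { [key: string]: JsonNode };

// Turn nested JSON into one "path = value" line per leaf.
function flattenJson(node: JsonNode, prefix: string = ""): string[] {
  if (node === null || typeof node !== "object") {
    return [`${prefix} = ${JSON.stringify(node)}`];
  }
  const lines: string[] = [];
  if (Array.isArray(node)) {
    for (let i = 0; i < node.length; i++) {
      lines.push(...flattenJson(node[i], `${prefix}/${i}`));
    }
  } else {
    for (const key of Object.keys(node)) {
      lines.push(...flattenJson(node[key], `${prefix}/${key}`));
    }
  }
  return lines;
}

// Naive retrieval: keep lines whose last key segment is mentioned in the question.
function selectRelevantLines(lines: string[], question: string, limit: number): string[] {
  const q = question.toLowerCase();
  return lines
    .filter((line) => {
      const keyPath = line.substring(0, line.indexOf(" = "));
      const last = keyPath.substring(keyPath.lastIndexOf("/") + 1).toLowerCase();
      return last.length > 0 && q.indexOf(last) !== -1;
    })
    .slice(0, limit);
}

// Put the selected lines in front of the question so the model can comment on them.
function buildPrompt(context: string[], question: string): string {
  return ["Current settings:", ...context, "", "Question:", question].join("\n");
}

// Made-up config and question, for illustration only.
const demoConfig: JsonNode = { compilerOptions: { strict: true, esModuleInterop: false } };
const demoQuestion = "How do I fix ts(1259) about the 'esModuleInterop' flag?";
const demoContext = selectRelevantLines(flattenJson(demoConfig), demoQuestion, 5);
console.log(buildPrompt(demoContext, demoQuestion));
```

The difference from the agent approach is that the model receives the relevant setting together with the question in a single prompt, so it can comment on the value instead of only looking it up.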

References

https://js.langchain.com/docs/integrations/toolkits/json