Published on 2025/11/20
I have tried out different AI tools for development and have to admit how powerful and intelligent they can be, especially once you learn how to use them well, which genuinely requires effort and experimentation. The more I use them, the more I realize the importance of knowing how a tool works under the hood, which goes some way toward explaining why it produces the output it does. Starting from there, you can think about what to modify and how to run it better, and eventually find the way that works best for you.
GitHub Copilot is the tool my company allows us to use, and so far it is the one I use most. It is easy to get hands-on with, but frankly its native modes are not excellent. The good thing is that it lets users customize it, unlocking the possibility of making it perform the way you wish. While I am still in the middle of pursuing my own setup, at this point I would like to share how to better understand how GitHub Copilot works, and hopefully this can be inspiring. The following tips are about running it in VS Code.
Full conversation here means the messages sent to the LLM, including the system prompt and user messages, which unfolds everything that happens in an LLM call inside the IDE, just like sending a message to any LLM through a chat interface. This may sound less interesting and magical, but it is the truth.
By opening the Output view and choosing GitHub Copilot Chat, you can see all the interactions, or call them requests.
For each request, the model that is called and the mode you are using are easy to spot:
ccreq:*<xxxxxxxx>.copilotmd* | unknown | claude-sonnet-4.5 | 2881ms | [panel/editAgent]
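Each summary line packs several fields into one `|`-separated record. Here is a minimal Python sketch of pulling them apart; the field order (request id, status, model, duration, location) is an assumption read off the example line above, not a documented format:

```python
# Hypothetical helper: split one Copilot request summary line into its fields.
# The field order (id | status | model | duration | location) is assumed from
# the example log line, not from any official specification.
def parse_request_line(line: str) -> dict:
    parts = [p.strip() for p in line.split("|")]
    return {
        "request": parts[0],
        "status": parts[1],
        "model": parts[2],
        "duration_ms": int(parts[3].rstrip("ms")),  # "2881ms" -> 2881
        "location": parts[4],
    }

info = parse_request_line(
    "ccreq:<xxxxxxxx>.copilotmd | unknown | claude-sonnet-4.5 | 2881ms | [panel/editAgent]"
)
print(info["model"], info["duration_ms"])  # claude-sonnet-4.5 2881
```

A one-liner like this is handy when you want to grep the Output view for slow requests or check which model a particular mode actually called.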
The most mind-blowing part is the log file, which contains everything you want to know, and it is sitting right there, open for you.

The file contains metadata and all messages, including the system prompts as well as the user messages. Details of the LLM call are disclosed in the metadata, from which you can easily see token consumption and response time.
requestType : ChatCompletions
model : claude-sonnet-4.5
maxPromptTokens : 127997
maxResponseTokens: 16000
location : 7
otherOptions : {"temperature":0,"stream":true}
intent : undefined
startTime : <>
endTime : <>
duration : 85730ms
response rate : 64.49 tokens/s
ourRequestId : <>
requestId : <>
serverRequestId : <>
timeToFirstToken : 4330ms
resolved model : claude-sonnet-4.5
usage : {"completion_tokens":5629,"prompt_tokens":89884,"prompt_tokens_details":{"cached_tokens":10289},"total_tokens":95483}
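One useful thing you can read straight off the `usage` field is how much of the prompt was served from the prompt cache. A minimal Python sketch using the numbers logged above:

```python
# Sketch: compute the prompt-cache hit ratio from the "usage" JSON shown in
# the log metadata above. The numbers are copied verbatim from that log entry.
import json

usage = json.loads(
    '{"completion_tokens":5629,"prompt_tokens":89884,'
    '"prompt_tokens_details":{"cached_tokens":10289},"total_tokens":95483}'
)
cache_hit = usage["prompt_tokens_details"]["cached_tokens"] / usage["prompt_tokens"]
print(f"cache hit ratio: {cache_hit:.1%}")  # cache hit ratio: 11.4%
```

Watching this ratio across requests in a session is a quick way to see whether your instruction files and context are being cached between turns or re-sent in full every time.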
The messages are even more interesting, as the plaintext of the system prompts is exposed right before your eyes, along with how other instruction files are organized. For example, AGENTS.md is included in the system prompt as an attachment, as follows: