Exploring How AI Works Behind the Scenes

I’ve been working on a small experiment to see how far an AI agent can go in real-world coding — not just generating code snippets, but actually creating folders, generating files, and editing them automatically.

This project is a small CLI agent that uses Google's Gemini API to generate complete websites from natural language prompts. It's simple, safe, and practical — the agent executes one shell command at a time, reads/writes files, and scaffolds HTML/CSS/JS projects.

What it does (quick)

Accepts natural language prompts and calls Gemini.
Uses function-calling for safe helpers: executeCommand, readFile, writeFile, listFiles.
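
To make the function-calling idea concrete, here is a minimal sketch of how the four helpers could be described to the model and checked before anything runs. The helper names come from the list above; the parameter names (`command`, `path`, `content`, `dir`), the JSON-Schema-style layout, and `validateCall` are assumptions for illustration, not the project's actual code. Check the Gemini function-calling docs for the exact schema format.

```javascript
// Hypothetical declarations for the four helpers. Parameter names are assumed.
const toolDeclarations = [
  {
    name: 'executeCommand',
    description: 'Run a single shell command and return its output.',
    parameters: {
      type: 'object',
      properties: { command: { type: 'string' } },
      required: ['command'],
    },
  },
  {
    name: 'readFile',
    description: 'Read a text file and return its contents.',
    parameters: {
      type: 'object',
      properties: { path: { type: 'string' } },
      required: ['path'],
    },
  },
  {
    name: 'writeFile',
    description: 'Create or overwrite a text file.',
    parameters: {
      type: 'object',
      properties: { path: { type: 'string' }, content: { type: 'string' } },
      required: ['path', 'content'],
    },
  },
  {
    name: 'listFiles',
    description: 'List the entries in a directory.',
    parameters: {
      type: 'object',
      properties: { dir: { type: 'string' } },
      required: ['dir'],
    },
  },
];

// Return an error message if the requested call is unknown or incomplete,
// or null if it is safe to hand over to the matching helper.
function validateCall(name, args = {}) {
  const declaration = toolDeclarations.find((tool) => tool.name === name);
  if (!declaration) return `Unknown tool: ${name}`;
  const missing = declaration.parameters.required.filter((key) => !(key in args));
  return missing.length > 0
    ? `Missing arguments for ${name}: ${missing.join(', ')}`
    : null;
}
```

Because the model can only request one of these named helpers, the Node.js side stays in control of what actually touches the file system or shell.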

I recently built an open-source Node.js tool called node-gemini-ai-agent that lets you create or modify websites just by describing what you want — and more importantly, helps you understand how AI agents actually work behind the scenes.

It’s powered by Google’s Gemini model and can run shell commands, create folders, write files, and edit your code directly — like having a small AI developer working in your terminal.

What it actually does

You can ask the agent to:

Create new project folders automatically based on your idea.
Write or edit HTML, CSS, and JavaScript files.
Run safe shell commands for setup or testing.
Keep context between prompts so it remembers your previous steps.

Example:

text

> Build a portfolio site with a hero section

The agent guesses your project name, creates a folder, and scaffolds everything:

text

portfolio-site/
 ├── index.html
 ├── index.css
 └── index.js

Then you can continue:

text

> Change the hero section background to a video

and it’ll edit the right files automatically.
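
One way to picture how a follow-up prompt can refer to earlier work is as a growing list of conversation turns that gets sent with every request. The sketch below assumes this approach; `ChatSession` and its methods are hypothetical names, not the project's actual code. The `role` and `parts` layout mirrors the message format the Gemini API uses.

```javascript
// A tiny in-memory conversation history (illustrative only).
class ChatSession {
  constructor() {
    this.contents = [];
  }

  // Record what the user typed and return the full history to send.
  addUser(text) {
    this.contents.push({ role: 'user', parts: [{ text }] });
    return this.contents;
  }

  // Record what the model answered so later prompts can build on it.
  addModel(text) {
    this.contents.push({ role: 'model', parts: [{ text }] });
  }
}

const session = new ChatSession();
session.addUser('Build a portfolio site with a hero section');
session.addModel('Created portfolio-site/ with index.html, index.css, and index.js.');

// The second prompt travels together with the first exchange, which is how
// the model can work out which "hero section" is meant.
const payload = session.addUser('Change the hero section background to a video');
console.log(payload.length);
```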

Setup

Clone and install:

bash

git clone https://github.com/sannjayy/node-gemini-ai-agent.git
cd node-gemini-ai-agent
npm install

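Add your Gemini API key (the command below is for macOS and Linux shells; in Windows PowerShell, use $env:GEMINI_API_KEY = "your_api_key" instead):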

bash

export GEMINI_API_KEY=your_api_key

Start the agent:

bash

npm start

You’ll now get a terminal chat where you can describe your project directly.

Example: Build a Simple App in Minutes

Let’s say you want to make a food delivery landing page:

text

> Create a landing page for a food delivery app with navbar, hero section, and order button

What happens behind the scenes:

The agent reads your prompt and decides what tools it needs (like file creation or writing code).
It creates a new folder based on your prompt, e.g. food-delivery-app/.

Inside it, it writes:

text

food-delivery-app/
 ├── index.html
 ├── index.css
 └── index.js

It fills in placeholder content, sample styles, and connects everything.
It logs every step so you can see what it’s doing.
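
The scaffolding steps above boil down to a handful of file system calls. Here is a rough sketch using only Node's built-in modules. `toFolderName` and `scaffoldProject` are hypothetical names, and the slug rule and placeholder file contents are assumptions; in the real agent, the model chooses the name and writes the content.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Turn a project name into a folder-friendly slug (assumed naming rule).
function toFolderName(name) {
  return name
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Create the folder and the three starter files, wired together.
function scaffoldProject(rootDir, projectName) {
  const dir = path.join(rootDir, toFolderName(projectName));
  fs.mkdirSync(dir, { recursive: true });

  const files = {
    'index.html': [
      '<!DOCTYPE html>',
      '<html lang="en">',
      '<head>',
      '  <meta charset="utf-8">',
      `  <title>${projectName}</title>`,
      '  <link rel="stylesheet" href="index.css">',
      '</head>',
      '<body>',
      `  <h1>${projectName}</h1>`,
      '  <script src="index.js"></script>',
      '</body>',
      '</html>',
      '',
    ].join('\n'),
    'index.css': 'body { margin: 0; font-family: sans-serif; }\n',
    'index.js': "console.log('Page loaded');\n",
  };

  for (const [fileName, content] of Object.entries(files)) {
    const filePath = path.join(dir, fileName);
    fs.writeFileSync(filePath, content, 'utf8');
    console.log(`wrote ${filePath}`); // log each step, as the agent does
  }
  return dir;
}
```

Calling `scaffoldProject(process.cwd(), 'Food Delivery App')` would create a `food-delivery-app/` folder with the three files shown in the tree above.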

Then you can tweak it:

text

> Add footer with contact info
> Update button color to red
> Add short description below the hero image

Each instruction edits the files on disk in real time.

What’s Cool About It

Automatic folder detection — No need to name anything; it guesses from your prompt.
Scaffolds instantly — HTML, CSS, JS are generated in seconds.
Real file edits — You can inspect the actual code after each step.
Safe and smart — Blocks harmful commands and times out safely.
Cross-platform — Works on macOS, Linux, and Windows.
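
To give a feel for the "blocks harmful commands and times out" part, here is a simplified sketch of a guarded command runner. The pattern list, the 10-second default, and the names `isBlocked` and `runSafely` are assumptions for illustration; the project's real rules may differ, and a short deny list like this one is only a first line of defense.

```javascript
const { spawnSync } = require('node:child_process');

// Example patterns only. A real deny list would be longer.
const BLOCKED_PATTERNS = [
  /\brm\s+-\w*[rf]\w*\s+(\/|~)(\s|$)/i, // recursive or forced delete of / or ~
  /\bsudo\b/i,
  /\bmkfs\b/i,
  /\b(shutdown|reboot)\b/i,
];

function isBlocked(command) {
  return BLOCKED_PATTERNS.some((pattern) => pattern.test(command));
}

// Run one shell command, refusing blocked ones and stopping slow ones.
function runSafely(command, timeoutMs = 10000) {
  if (isBlocked(command)) {
    return { ok: false, reason: 'blocked', output: '' };
  }
  const result = spawnSync(command, {
    shell: true, // use the platform's default shell
    timeout: timeoutMs, // the child process is killed after this many ms
    encoding: 'utf8',
  });
  if (result.error) {
    const reason = result.error.code === 'ETIMEDOUT' ? 'timeout' : 'error';
    return { ok: false, reason, output: result.stderr || '' };
  }
  const ok = result.status === 0;
  return {
    ok,
    reason: ok ? 'done' : 'exit-code',
    output: `${result.stdout}${result.stderr}`,
  };
}
```

Returning a plain result object instead of throwing means the outcome, including a refusal, can be passed straight back to the model so it can pick a different step.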

This project is not just for building; it’s also a window into how AI agents reason, choose actions, and modify code autonomously.

Understanding How It Works

Each time you type a prompt, the agent:

Sends your message to the Gemini model.
Gemini returns a plan — which “tools” to call next (like writeFile or executeCommand).
The Node.js app executes that plan and returns the results to Gemini.
Gemini checks the output, decides the next step, and continues until the task is done.

So you’re basically watching an AI reason, act, and iterate — one function call at a time.
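
The loop described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: `runAgent`, `scriptedModel`, and the `next` method are hypothetical names, and a scripted stand-in replaces the real Gemini call so the example runs offline. The `functionCall` and `functionResponse` part names follow the Gemini function-calling format.

```javascript
// Stand-in for the model: replays a fixed list of replies, one per request.
function scriptedModel(replies) {
  let index = 0;
  return { next: async () => replies[index++] };
}

// Ask the model, run the tool it requests, feed the result back, repeat.
async function runAgent(model, tools, prompt, maxSteps = 10) {
  const history = [{ role: 'user', parts: [{ text: prompt }] }];

  for (let step = 0; step < maxSteps; step++) {
    const reply = await model.next(history);

    // No tool requested: the model considers the task finished.
    if (!reply.functionCall) {
      return { text: reply.text, history };
    }

    const { name, args } = reply.functionCall;
    history.push({ role: 'model', parts: [{ functionCall: { name, args } }] });

    // Execute the requested helper and capture failures as data,
    // so the model can see the error and decide what to do next.
    let result;
    try {
      const handler = tools[name];
      result = handler ? await handler(args) : { error: `Unknown tool: ${name}` };
    } catch (err) {
      result = { error: err.message };
    }
    history.push({
      role: 'user',
      parts: [{ functionResponse: { name, response: result } }],
    });
  }
  throw new Error(`Stopped after ${maxSteps} steps without finishing`);
}
```

The `maxSteps` guard is a design choice worth keeping in any version of this loop, since it stops a confused model from calling tools forever.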

Why I Built It

I wanted to make something that’s both practical and educational — a way to learn how AI agents really work, not just what they output. You can explore tool-based reasoning, file system control, and error handling — all in real code.

Try It Yourself

Repo: github.com/sannjayy/node-gemini-ai-agent

Clone it, chat with your terminal, and see how an AI agent actually builds and edits a project.

You’ll learn two things:

How fast you can scaffold ideas into working code.
How AI agents think in steps to complete real-world dev tasks.

If you build something fun with it, tag me (sannjayy_dev) — I’d love to see what kind of projects people create.

Build Websites Using Own AI Agent

Written by Sanjay Sikdar
