In late June 2025, Google released Gemini CLI, an open-source AI agent designed to integrate the power of its flagship Gemini 2.5 Pro model directly into the developer’s command line interface. Getting started with Gemini CLI is a straightforward process, but it requires a clear understanding of its core dependency and authentication methods. This guide will walk you through setting up the tool for both casual and professional use.
Prerequisites
The single most important prerequisite for the official google-gemini/gemini-cli tool is Node.js version 18 or higher. While Google provides Software Development Kits (SDKs) for interacting with the Gemini API in other languages like Python, Go, and Java, the official agentic CLI tool itself is built on the Node.js runtime.
You can verify your Node.js version by running:
    node -v
If you do not have Node.js installed, you can download it from the official website at nodejs.org.
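If you want to automate this check, for example in a setup script, the comparison can be sketched as follows. This is only an illustration: the names MIN_NODE_MAJOR, parseNodeMajor, and isSupportedNode are hypothetical and are not part of Gemini CLI.

```typescript
// Minimum major version stated in this guide.
const MIN_NODE_MAJOR = 18;

// Extracts the major version from strings such as "v18.17.1" or "20.11.0".
// Returns null when the string does not look like a Node.js version.
function parseNodeMajor(version: string): number | null {
  const match = /^v?(\d+)\.\d+\.\d+/.exec(version.trim());
  return match ? Number(match[1]) : null;
}

// True when the reported version meets the minimum requirement.
function isSupportedNode(version: string): boolean {
  const major = parseNodeMajor(version);
  return major !== null && major >= MIN_NODE_MAJOR;
}
```

In a Node.js script you could pass process.version, or the output of node -v, to isSupportedNode.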
Installation
You have two primary methods for installing and running Gemini CLI, depending on your intended usage.
- One-Off Execution with npx (Recommended for First-Time Use): For those who want to try Gemini CLI without a permanent global installation, the npx command is ideal. It fetches and runs the package without installing it globally. Open your terminal and run:
      npx https://github.com/google-gemini/gemini-cli
- Global Installation with npm (Recommended for Regular Use): For regular use, installing the package globally is more convenient. This makes the gemini command available from any directory in your terminal. Run the following command:
      npm install -g @google/gemini-cli
Once the installation is complete, you can launch the CLI simply by typing:
    gemini
Authentication Deep Dive
Gemini CLI offers two main authentication methods, catering to different needs for usage limits and data privacy. Upon first launch, you will be prompted to select a method.
Method 1: The Free Tier (Google Account Login)
This is the default and recommended method for individual developers, students, and hobbyists.
- Process: When you select “Login with Google,” the CLI will open your default web browser for a standard OAuth authentication flow.
- Benefits: This method is completely free and grants you access to the powerful Gemini 2.5 Pro model with a generous usage allowance: 60 model requests per minute and 1,000 requests per day.
- Privacy Note: Be aware that when using the free tier, your prompts may be used to improve Google’s models.
Method 2: The Power User (API Key)
This method is for professional developers, teams, or anyone who needs higher usage limits, access to specific models, or guaranteed data privacy.
- Process:
  - Generate an API Key: Go to Google AI Studio at https://aistudio.google.com/app/apikey to create a new API key.
  - Set the Environment Variable: The CLI automatically detects the API key if it’s set as an environment variable named GEMINI_API_KEY. This is the most convenient way to use an API key, and it avoids hard-coding the key in scripts.
    - macOS / Linux (Bash/Zsh): Add this to your .zshrc or .bashrc file for persistence.
          export GEMINI_API_KEY="YOUR_API_KEY"
    - Windows (Command Prompt): This is for the current session only. Do not wrap the value in quotes, because Command Prompt stores them as part of the value.
          set GEMINI_API_KEY=your_api_key_here
    - Windows (PowerShell): This is for the current session only. To make it persistent, add it to your PowerShell profile script.
          $env:GEMINI_API_KEY = "your_api_key_here"
- Benefits: Using an API key ensures your data will not be used for model training and allows you to access usage-based billing through Google AI Studio or Vertex AI for needs exceeding the free tier.
It is important to clarify the distinction between different “Gemini” products to avoid confusion. The Gemini API is the backend service that processes requests. The Gemini SDKs are language-specific libraries (e.g., for Python, Go) used to call that API programmatically. The Gemini CLI is the specific Node.js-based agentic tool that is the subject of this guide. This is particularly important because an older, unaffiliated, and now-deprecated Go-based tool also named gemini-cli exists, which can cause confusion. Always ensure you are using the official google-gemini/gemini-cli package.
Mastering the Core Workflow
Once installed and authenticated, interacting with Gemini CLI happens within a purpose-built terminal interface. Understanding its components and commands is key to unlocking its full potential.
Anatomy of the CLI
The interactive interface is designed for clarity and efficiency. Key elements include:
- The Prompt (>): This is where you type your natural language requests to the agent.
- Streaming Output: As the agent reasons and generates responses or code, the text streams into the terminal in real time, providing immediate feedback rather than making you wait for the full response to complete.
- Status Bar: Located at the bottom of the screen, this crucial element provides at-a-glance information about the current session, including the working directory, the model being used (e.g., gemini-2.5-pro), and the remaining token context.
Essential Gemini CLI Commands and Syntax
A set of “slash commands” and special character syntax provides powerful control over the agent and its environment. This table serves as a quick-reference cheat sheet.
| Command / Syntax | Description | Example |
| --- | --- | --- |
| /help | Displays a comprehensive help menu with available commands and keyboard shortcuts. | /help |
| /tools | Lists all the built-in tools the agent can currently use, such as ReadFile or Shell. | /tools |
| /auth | Allows you to switch the authentication method (e.g., from Google Login to API Key) within a session. | /auth |
| /theme | Lets you change the color theme of the CLI interface. | /theme |
| /stats | Shows usage statistics for the current session, including token counts. | /stats |
| /memory | Manages the agent’s short-term memory, allowing you to view or clear stored facts. | /memory |
| /chat <cmd> | Manages chat sessions: list shows saved chats, save <tag> saves the current one, and resume <tag> loads a saved chat. | /chat save my_project_session |
| !<command> | Shell passthrough. Executes any shell command directly and shows the output to you and the agent. | !npm test |
| @<file/folder> | Adds a specific file or the contents of an entire folder to the agent’s context for the next prompt. | > Summarize the key points in @src/main.js |
| GEMINI.md | A file in the project root used to provide persistent, project-level instructions or context to the agent. | N/A (file-based) |
Context is King: The @ Specifier and GEMINI.md
Two of the most powerful features for controlling the agent’s behavior are the @ specifier and the GEMINI.md file.
- The @ Specifier: By prefixing a file or folder path with @, you explicitly tell the agent which parts of your codebase to focus on for a specific task. This is far more efficient than relying on the agent to guess which files are relevant. For example, the following prompt directs the agent’s attention precisely where it’s needed:
      > Refactor the function in @api/utils.ts to use async/await
- The GEMINI.md File: For context that should apply across an entire session, create a file named GEMINI.md in the root of your project directory. The agent reads this file at the start of a session and uses its contents as high-level instructions. This is the ideal place to define project-wide rules, such as “Always use functional components in React,” “Follow the TDD methodology,” or “The primary database is PostgreSQL.”
A Practical Project Walkthrough
The best way to understand Gemini CLI’s capabilities is to build something with it. This walkthrough will guide you through creating a complete React-based Task Manager application from scratch, using only natural language prompts. This project is based on a real-world example of using the CLI.
Project Goal: Build a task management app with React, TypeScript, and Tailwind CSS. Features will include adding, editing, completing, archiving, and deleting tasks.
Step 1: Project Scaffolding
First, create a new directory for your project, navigate into it, and launch the Gemini CLI.
    mkdir react-task-manager && cd react-task-manager
    gemini
Once inside the CLI, issue your first prompt to have the agent set up the entire project structure.
    > Initialize a new React project with TypeScript and Tailwind CSS using Vite.
The agent will now use its Shell and WriteFile tools to run the Vite creation command (npm create vite@latest . -- --template react-ts) and generate the necessary configuration files like tailwind.config.js, vite.config.ts, and tsconfig.json. This single prompt saves significant manual setup time.
Step 2: Building the UI
With the project structure in place, instruct the agent to create the core UI components.
    > Create the main App.tsx component to hold the application state. Also, create a separate component in src/components/TaskForm.tsx for the task input form.
The agent will generate the boilerplate for these two files. You can then ask it to flesh them out.
    > In App.tsx, set up a state variable using useState to hold an array of task objects. Each task should have an id, text, and a completed status. Populate it with two dummy tasks.
    > In TaskForm.tsx, create a form with a textarea for task input and a submit button styled with Tailwind CSS.
Step 3: Iterative Feature Development
Now, build the application’s functionality feature by feature, demonstrating the agentic workflow.
Add a Task:
    > Implement the logic to add a new task. The addTask function in App.tsx should take the text from the TaskForm, create a new task object, and add it to the tasks array.
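To make the request concrete, here is a minimal sketch of the kind of logic this prompt asks for, written as a pure function so it can be read without React. It is an illustration only, and the code the agent actually generates will differ. The Task shape follows the earlier prompt (id, text, completed); the id scheme and the handling of blank input are assumptions.

```typescript
// Shape described in the earlier prompt: an id, the text, and a completed flag.
interface Task {
  id: number;
  text: string;
  completed: boolean;
}

// Returns a new array with the task appended; blank input is ignored.
// The id is one more than the largest existing id (an illustrative choice).
function addTask(tasks: readonly Task[], text: string): Task[] {
  const trimmed = text.trim();
  if (trimmed === "") {
    return [...tasks];
  }
  const nextId = tasks.reduce((max, task) => Math.max(max, task.id), 0) + 1;
  return [...tasks, { id: nextId, text: trimmed, completed: false }];
}
```

Inside App.tsx, a function like this would typically be called from the form’s submit handler, with the result passed to the state setter returned by useState.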
Edit a Task: This is where the agent’s ability to modify existing code shines.
    > Add the ability to edit a task. When a user clicks an 'Edit' button next to a task, the task's text should become an editable textarea. There should be 'Save' and 'Cancel' buttons to commit or discard the changes.
The agent will analyze App.tsx, add the necessary state to track which task is being edited, and generate the required event handlers (startEditing, saveTask, cancelEditing) and conditional JSX to render either the task text or the edit form.
Step 4: Debugging and Refactoring
To see how the agent handles code improvement, give it a refactoring task.
    > The state management in App.tsx is getting complex. Refactor the state logic to use the useReducer hook instead of multiple useState calls for better scalability.
The agent will read the current App.tsx file, understand the existing state logic, and then use its Edit tool to replace the useState calls with a useReducer implementation, including the reducer function and dispatch calls. Before applying the changes, it will show you a diff of the proposed modifications for your approval, putting you in full control of the process.
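For orientation, a reducer for this app might look roughly like the sketch below. It is a hand-written illustration under assumptions, not the agent’s output: the action names reuse the handler names mentioned earlier (startEditing, saveTask, cancelEditing), while the TaskState shape and the add and toggle actions are hypothetical.

```typescript
interface Task {
  id: number;
  text: string;
  completed: boolean;
}

interface TaskState {
  tasks: Task[];
  editingId: number | null; // id of the task currently being edited, if any
}

type TaskAction =
  | { type: "add"; text: string }
  | { type: "toggle"; id: number }
  | { type: "startEditing"; id: number }
  | { type: "saveTask"; id: number; text: string }
  | { type: "cancelEditing" };

// Pure reducer: returns a new state object and never mutates the old one.
function taskReducer(state: TaskState, action: TaskAction): TaskState {
  switch (action.type) {
    case "add": {
      const nextId = state.tasks.reduce((max, t) => Math.max(max, t.id), 0) + 1;
      const task: Task = { id: nextId, text: action.text, completed: false };
      return { ...state, tasks: [...state.tasks, task] };
    }
    case "toggle":
      return {
        ...state,
        tasks: state.tasks.map((t) =>
          t.id === action.id ? { ...t, completed: !t.completed } : t
        ),
      };
    case "startEditing":
      return { ...state, editingId: action.id };
    case "saveTask":
      return {
        tasks: state.tasks.map((t) =>
          t.id === action.id ? { ...t, text: action.text } : t
        ),
        editingId: null,
      };
    case "cancelEditing":
      return { ...state, editingId: null };
    default:
      return state;
  }
}
```

In the component, this would be wired up with React’s useReducer hook, for example const [state, dispatch] = useReducer(taskReducer, initialState), and each event handler would call dispatch with one of the actions above.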
Following this iterative process of prompting, reviewing, and approving, you can continue to add features like archiving, deleting, and filtering tasks until you have a fully functional application, built almost entirely through conversational commands.
Advanced Techniques
Beyond basic app creation, Gemini CLI offers advanced capabilities that can dramatically accelerate complex workflows.
Multimodal Development: From Sketch to Code
One of the most powerful features is the ability to translate visual concepts into code. This tutorial demonstrates how to generate a component from a simple image.
- Prepare the Image: Create or find a simple image of a UI element, such as a user profile card. Save it in your project directory as ui_sketch.jpg.
- Launch Gemini CLI: Run gemini from your project’s root directory.
- Craft the Prompt: Use the @ specifier to reference the image file in your prompt.
      > Look at the user profile card in @ui_sketch.jpg. Generate a new React component named 'ProfileCard.tsx' using TypeScript and Tailwind CSS to build this UI. The component should accept props for the user's name, handle, and avatar URL.
- Observe and Approve: The agent will process the image, reason about its structure (an avatar, a name, a handle), and then propose to use the WriteFile tool to create src/components/ProfileCard.tsx with the corresponding JSX and TypeScript definitions. After you approve, the file will be created on your disk.
Automating with Shell Scripts
Gemini CLI can be invoked non-interactively from scripts, making it a powerful tool for automation. The following example script demonstrates how to add a copyright header to all JavaScript files in a project.
Create a file named batch_copyright.sh and add the following content:
#!/bin/bash
# batch_copyright.sh: Adds copyright headers to all .js files using Gemini CLI.

# Require the copyright year as the first argument
if [ -z "$1" ]; then
    echo "Usage: $0 <copyright_year>"
    exit 1
fi
COPYRIGHT_YEAR=$1

# Find all JavaScript files in the src directory
find src -name "*.js" | while IFS= read -r file; do
    echo "Processing $file..."
    # Create a temporary file for the new content
    TEMP_FILE=$(mktemp)
    # Pipe the existing file content plus a prompt into Gemini CLI.
    # Replace the original only if gemini exits successfully and produced non-empty output.
    if (cat "$file"; echo -e "\n\n---\nAdd the following copyright header to the top of this JavaScript code:\n// Copyright (c) $COPYRIGHT_YEAR My Company, Inc. All rights reserved.") | gemini > "$TEMP_FILE" && [ -s "$TEMP_FILE" ]; then
        mv "$TEMP_FILE" "$file"
        echo "✅ Updated $file"
    else
        echo "❌ Error processing $file"
        rm -f "$TEMP_FILE"
    fi
done
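Make the script executable (chmod +x batch_copyright.sh) and run it (./batch_copyright.sh 2025). This script iterates through each .js file under src, pipes its content along with a prompt into Gemini CLI, and overwrites the original file with the AI-modified output. Because the model’s response replaces each file wholesale, review the changes (for example, with git diff) before committing them.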
Extending with MCP (A Conceptual Guide)
The Model Context Protocol (MCP) is an open protocol that Gemini CLI supports as its mechanism for extensibility, allowing the agent to gain new, custom tools. While implementing a full MCP server is beyond this guide’s scope, the concept is straightforward.
Imagine you want to teach Gemini CLI to interact with your company’s internal bug tracking API.
- Create a Tool Server: You would build a simple web server (e.g., using Python/Flask or Node.js/Express) that exposes an endpoint, such as POST /query-bugs. This endpoint would take a JSON payload (e.g., {"assignee": "user@example.com"}), query your bug tracker, and return the results. This is your MCP server. In practice, an MCP server exposes its tools through the protocol’s JSON-RPC messages rather than an arbitrary REST route, so treat this endpoint as a simplification.
- Register the Server: In Gemini CLI’s settings file (.gemini/settings.json), you would register the URL of your MCP server.
- Use the New Tool: Now, you could simply prompt the agent in natural language:
      > Find all open bugs assigned to me in the 'Phoenix' project.
  The agent, aware of its new tool, would reason that it needs to call your /query-bugs endpoint, formulate the correct request, and present the results to you.
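The lookup behind the conceptual /query-bugs endpoint described above could be as small as the sketch below. Everything in it is hypothetical: the Bug and BugQuery shapes, the project and status fields, and the queryBugs function are invented for illustration. It shows only the filtering step, not the MCP protocol layer or any real bug tracker API.

```typescript
// Hypothetical bug record; a real tracker would define its own fields.
interface Bug {
  id: number;
  title: string;
  project: string;
  assignee: string;
  status: "open" | "closed";
}

// Hypothetical request payload, extending the {"assignee": ...} example.
interface BugQuery {
  assignee: string;
  project?: string;
  status?: "open" | "closed";
}

// Returns the bugs that match every field present in the query.
function queryBugs(bugs: readonly Bug[], query: BugQuery): Bug[] {
  return bugs.filter(
    (bug) =>
      bug.assignee === query.assignee &&
      (query.project === undefined || bug.project === query.project) &&
      (query.status === undefined || bug.status === query.status)
  );
}
```

For the example prompt, the agent would need to translate “open bugs assigned to me in the 'Phoenix' project” into a payload such as { assignee: "user@example.com", project: "Phoenix", status: "open" }.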
Troubleshooting and Best Practices
Even powerful tools can have quirks. This chapter covers common problems, error codes, and best practices to ensure a smooth experience.
Diagnosing Common Problems
- Authentication Failures: A frequent issue arises when users with Google Workspace accounts try to log in, or when they use the same account across multiple environments (e.g., a local machine and a cloud-based IDE like GitHub Codespaces). This can lead to errors about a GOOGLE_CLOUD_PROJECT env var being required.
  - Solution: The most reliable fix is to switch to the API key authentication method. This bypasses the complex OAuth flow and provides a more stable connection.
- Antivirus/Firewall Conflicts: Some antivirus or firewall software may block Node.js (node.exe) or powershell.exe from opening a local port, which is required for the browser-based authentication to work. This results in the authentication process timing out or failing to connect.
  - Solution: First try adding explicit exceptions for the Node.js and powershell.exe executables. If that does not help, a clean-slate approach is often necessary. The recommended steps are: 1) Completely uninstall the antivirus software. 2) Completely uninstall Node.js. 3) Restart the computer. 4) Reinstall Node.js. 5) Run the Gemini CLI setup and complete the authentication. 6) Reinstall the antivirus software and add explicit exceptions for the Node.js and powershell.exe executables.
Common Gemini API Error Codes and Solutions
When the CLI interacts with the backend API, you may encounter HTTP error codes. This table translates the most common ones into actionable solutions.
| HTTP Code | Status | Common Cause | Recommended Solution |
| --- | --- | --- | --- |
| 400 | INVALID_ARGUMENT | The request is malformed (e.g., a typo in a parameter). | Check the API documentation for the correct request format and parameters. |
| 403 | PERMISSION_DENIED | The API key is invalid or lacks the required permissions. | Verify your API key is correct and has the necessary access rights in your Google Cloud project. |
| 429 | RESOURCE_EXHAUSTED | You have exceeded your rate limit (e.g., >60 requests/minute on the free tier). | Wait and retry your request after a minute. For higher needs, request a quota increase or use a paid plan. |
| 500 | INTERNAL | An unexpected error occurred on Google’s side, or the prompt context is too long. | Try reducing the size of your prompt or the number of files included with @. If the issue persists, wait and retry. |
| 503 | UNAVAILABLE | The service is temporarily overloaded or down. | Wait a few moments and retry your request. Consider temporarily switching to a different model (e.g., gemini-2.5-flash). |
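If you call the Gemini API from your own scripts, the “wait and retry” advice for 429 and 503 can be sketched as a small exponential backoff helper. This is an illustrative pattern under assumptions, not a description of how Gemini CLI retries internally: the function names, the one-second base delay, the 60-second cap, and the choice to leave 500 out of the retryable set (because it may call for a smaller prompt first) are all hypothetical.

```typescript
// Status codes from the table that are usually worth retrying after a pause.
const RETRYABLE_STATUS: ReadonlySet<number> = new Set([429, 503]);

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUS.has(status);
}

// Exponential backoff: baseMs * 2^attempt, capped at maxMs. attempt starts at 0.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `request` until it returns a non-retryable status or attempts run out.
// `request` is any function that performs the call and resolves to the HTTP status.
async function withRetry(
  request: () => Promise<number>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<number> {
  let status = await request();
  for (let attempt = 0; attempt < maxAttempts - 1 && isRetryable(status); attempt++) {
    await sleep(backoffDelayMs(attempt));
    status = await request();
  }
  return status;
}
```

With the defaults, a request that keeps returning 429 is attempted four times in total, with pauses of 1, 2, and 4 seconds in between.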
The Conversation History Dilemma
One of the most significant limitations of Gemini CLI in its current version is the management of conversation history.
- The Problem: There is no built-in feature to automatically save or export a session’s conversation history. When you close the terminal, the entire interaction is lost unless manually copied.
- Current Workarounds:
  - Manual Copy/Paste: The most basic method is to manually select and copy the text from your terminal before closing the session.
  - /chat Commands: The CLI provides /chat save <tag> and /chat resume <tag> commands. However, users have reported this workflow as cumbersome, as it requires manually saving the state frequently.
- The Future: This is a known issue, and a feature request to add a proper /export-conversation command is being tracked publicly on the project’s GitHub page in Issue #2554. Users can monitor this issue for updates.
Pro-Tips for Effective Prompting
- Be Specific: Vague requests lead to vague results. Instead of “Fix my code,” try “The getUser function in @src/api.js is throwing a null pointer exception when the user is not logged in. Add a check to handle this case gracefully.”
- Decompose Tasks: For complex goals, break the problem down into smaller, logical steps and prompt the agent for each one sequentially.
- Use the Shell: Leverage the ! shell passthrough to run verification commands like !npm test or !eslint . after the agent makes changes, allowing you to validate its work without leaving the interface.
- Trust but Verify: Always use the Human-in-the-Loop (HiTL) system wisely. Carefully review the agent’s plan and the code diffs it proposes before granting it permission to write files or execute potentially destructive commands.
Your Agentic Co-Pilot
Gemini CLI represents a significant step forward in the evolution of developer tools, transforming the command line from a simple interface into an intelligent, conversational, and agentic workspace. By mastering its installation, core workflow, advanced features, and troubleshooting techniques, developers can unlock new levels of productivity, automating tedious tasks and accelerating the journey from idea to implementation. The tool is still young, with known limitations like conversation management that are actively being addressed. Its true potential, however, lies in its open-source nature. Developers are encouraged not only to use Gemini CLI as a tool but to participate in its growth—by contributing to the project on GitHub, reporting issues, and building a vibrant ecosystem of custom MCP servers that will collectively teach this powerful agent new skills, shaping the future of software development for everyone.