Showing posts with label github. Show all posts

Apps That See: Bringing Vision AI to Your Projects

I was wearing a t-shirt with a partial Reka logo at the edge of the frame. I never said the word "Reka" in that segment. The model caught the logo, connected it to the topic I was discussing, and mentioned it unprompted in the output it generated.

That is not a transcript trick. The model was watching.

At the AI Agents Conference 2026, I gave a talk called "Apps That See" — six live demos showing how to build applications that understand images and video. Every project is open source and ready to clone. This post walks through each one so you have enough context to pick it up, run it, and adapt it to something useful in your own work.

Vision AI Is Accessible Now

Not long ago, working with visual AI meant GPU clusters, specialized teams, and weeks of training. Today a compressed 4B model like Qwen or Gemma 3 runs on a regular laptop and handles image description well enough to prototype. Step up to a 7B model like Reka Edge and the quality improves meaningfully. It also runs locally: a gaming PC with a decent GPU is enough. No server required.

For tasks that need more power, cloud APIs give you faster results without local hardware requirements. The tradeoff is that your images and video go to a third-party provider. For corridor cameras or stock photos that is usually acceptable. For private or sensitive content, local is the better default.

The practical pattern: start local to build and test, then decide whether the task actually requires cloud.

What You Can Build With This

  • Accessibility: Describe a scene in real time for visually impaired users, or identify objects on demand.
  • Content creation: Extract structure from a video and turn it into a blog post, caption set, or highlight reel.
  • Productivity: Search through thousands of videos for a specific object or topic, even when the title gives no indication of the content.
  • Automation: Trigger actions only when specific visual conditions are met, such as an unrecognized person entering a room.
  • Fun: Most developers' first contact with AI is building something for themselves, and that is a perfectly valid starting point.


Demo 1: Caption This — Generate a Prompt from Any Image


Source: fboucher/caption-this

If you work with image generation models, you end up with a lot of images to test and compare. Writing the text prompt that would reproduce a specific image is tedious. This tool does it for you: give it an image, get back a prompt you can use to regenerate something similar.

The demo uses an HTTP client extension in VS Code to call the API directly, no SDK. Pass an image, ask for a plain-text prompt that would recreate it. One prompt detail that improved results noticeably: adding "no markdown" to the instruction.

POST https://api.reka.ai/v1/chat
Content-Type: application/json

{
  "model": "reka-flash",
  "messages": [{
    "role": "user",
    "content": [
      { "type": "image_url", "image_url": { "url": "https://..." } },
      { "type": "text", "text": "Write a prompt in plain text, no markdown, that would generate the exact same image." }
    ]
  }]
}

One thing to know when testing this across different models: some accept an image URL directly, others require the image as a base64-encoded string. Same task, same prompt, different input contract. If you plan to swap models in your app, account for this difference from the start.
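
A small helper can absorb that difference so the rest of your app doesn't care. Here is a minimal Python sketch; the data-URL convention shown for base64 input is common, but verify each model's exact contract in its docs:

import base64

def image_part(source: str, wants_base64: bool) -> dict:
    """Build the image entry of a chat message for either input contract."""
    if not wants_base64:
        # Contract A: the model accepts a plain URL.
        return {"type": "image_url", "image_url": {"url": source}}
    # Contract B: the model wants the bytes inline, base64-encoded as a data URL.
    with open(source, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {"type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{encoded}"}}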

Demo 2: Media Library — Compare Vision Models Side by Side


Source: fboucher/media-library

This is a web app that connects to multiple vision backends and lets you switch between them at runtime. The motivation: benchmark Reka Edge running locally — via OpenRouter or directly through the Reka API — against other models on real tasks.

Object detection surfaces the biggest portability problem. Some models return bounding boxes in an HTML-style bracket format with pixel coordinates. Others use a 2D box structure with a different coordinate scheme. If you code against one format and then swap models, your rendering breaks. There is no standard here — handle the differences at the application layer, not the model layer.

The app uses the OpenAI API format as the common interface across all backends. Any model with a compatible endpoint can be swapped in with minimal changes. It does not eliminate the per-model quirks, but it reduces the friction of switching to a configuration change rather than a rewrite.
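
To make that concrete, here is a minimal Python sketch using the openai client package, where each backend is just a base URL and a key; the local and OpenRouter endpoint values are illustrative assumptions:

from openai import OpenAI

# Any OpenAI-compatible backend slots in here; swapping models becomes
# a configuration change rather than a rewrite.
BACKENDS = {
    "local": {"base_url": "http://localhost:11434/v1", "key": "none"},
    "openrouter": {"base_url": "https://openrouter.ai/api/v1", "key": "<OR_KEY>"},
    "reka": {"base_url": "https://api.reka.ai/v1", "key": "<REKA_KEY>"},
}

def client_for(name: str) -> OpenAI:
    cfg = BACKENDS[name]
    return OpenAI(base_url=cfg["base_url"], api_key=cfg["key"])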

Video input is supported too, though far fewer models handle it than images. Of the models tested, Reka Edge is the standout for video — the others either reject it or behave inconsistently.

Demo 3: Video2Blog — Turn a Video into a Structured Post


Source: fboucher/video2blog

I built this for myself. I do a lot of tutorial videos and I wanted a tool that would turn a recording into a structured blog post without me having to write one from scratch.

The tool sends the video to a vision model with a detailed prompt: target structure, tone, format, and an instruction to flag moments where a screenshot would add value. The model returns timestamps — it cannot extract frames itself, but it tells you exactly where to look, and you pull them locally with ffmpeg.

That creates one architectural quirk worth knowing: the video lives in two places. ffmpeg needs it locally to extract frames. The hosted model needs it uploaded to analyze content. For a one-evening project it works well enough, and I use it often enough that it has paid for itself many times over.
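
The local half is a thin wrapper around ffmpeg. A sketch, assuming the model flagged a moment at 2:35 (file names are placeholders):

import subprocess

def grab_frame(video_path: str, timestamp: str, out_path: str) -> None:
    """Extract a single frame at the given timestamp using ffmpeg."""
    subprocess.run(
        ["ffmpeg",
         "-ss", timestamp,   # seek to the moment the model flagged
         "-i", video_path,
         "-frames:v", "1",   # grab exactly one frame
         "-y", out_path],    # overwrite any previous screenshot
        check=True,
    )

grab_frame("talk.mp4", "00:02:35", "screenshot-01.png")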

After the first draft, you stay in a conversation loop: change the tone, translate to French, swap a timestamp, restructure a section. The model holds context and iterates with you until the result is what you want.

Demo 4: Video Analyzer — Search and Query Your Video Library


Source: reka-ai/api-examples-dotnet

Most video search runs on titles, descriptions, and transcribed audio. This demo searches by what is actually visible on screen.

The app pre-indexes a video library by sending each video through a vision model ahead of time. When a query arrives, the heavy work is already done. A search for "robot arm" returns the right video — a clip of a robotic arm animation. It also returns a false positive: fast-moving hands apparently looked close enough to fool the model. Useful, not perfect, and worth designing around in your UX.
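
Stripped to its essentials, the pattern looks like this sketch; describe_video is a stand-in for whatever vision call you make, and the naive keyword match is only there to show where the pre-computed work pays off:

index: dict[str, str] = {}  # video path -> description of what's visible

def describe_video(path: str) -> str:
    """Stand-in for a vision-model call that returns a visual description."""
    raise NotImplementedError

def build_index(videos: list[str]) -> None:
    # The heavy work happens once, up front: one vision call per video.
    for path in videos:
        index[path] = describe_video(path)

def search(query: str) -> list[str]:
    # Query time is cheap: scan the stored descriptions.
    return [path for path, desc in index.items() if query.lower() in desc.lower()]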

The Q&A feature goes further. You pick a video and ask a specific question. "What database was used?" returned MySQL — and noted it was running in a Docker container. The model identified that from watching the screen, not from audio. No transcript needed.

From there, you can generate study materials from any recorded session. The demo produces a multiple-choice quiz with answer options, correct answers, and explanations. The model is doing comprehension, not transcription.
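
As a sketch, the quiz request can be the same Vision QA chat call used in Demo 5 with a different prompt; the endpoint and header come from that demo, and the prompt wording is just an example:

import requests

payload = {
    "video_id": "<indexed-video-id>",
    "messages": [{
        "role": "user",
        "content": ("Create a 5-question multiple-choice quiz about this video. "
                    "For each question, give four options, mark the correct "
                    "answer, and explain why it is correct."),
    }],
}
resp = requests.post(
    "https://vision-agent.api.reka.ai/qa/chat",
    headers={"X-Api-Key": "<API_KEY>"},
    json=payload,
    timeout=60,
)
print(resp.json())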

Demo 5: Roast My Life — What the Model Actually Sees


Source: reka-ai/api-examples-python

I never mentioned the pictures on my wall. The model did.

In a video about Python and AI, the model's generated blog post made a remark about the artwork hanging behind me. I had said nothing about it. The model noticed, mentioned it, and moved on as if it were obvious.

Then there was the t-shirt moment described at the top of this post. A partial logo, half out of frame, no mention of it anywhere in the audio — and the model connected it to the topic anyway.

This demo is named Roast My Life because the model ends up commenting on things you never intended to share. But the real point is what it reveals: a vision model is not a smarter transcript. It is watching. The larger models do this particularly well, and once you see it, it changes how you think about what these tools can do — and what they will pick up without you asking.

Demo 6: N8N Automation — No-Code Video Clipping Pipeline


Sources: N8N Reka Vision integration

Vision AI does not always need custom code. This demo wires everything together in N8N, a visual workflow tool, with no programming required.

The trigger is a new video published to YouTube. The workflow finds an engaging clip, reformats it from horizontal to vertical, adds captions in a specific style (all lowercase, specific colors — chosen to be obviously distinct from any default), and sends an email with the finished clip attached. The whole thing runs automatically.

For developers, this pattern is worth knowing even if you code everything else. Many real business workflows have a vision AI step that fits cleanly into a larger automation, and a no-code tool is often the fastest way to ship it.


Watch the Full Talk

The demos above are the written version. The live version, with the actual code running, models responding in real time, and a few things going sideways in interesting ways, is on YouTube.


All the Code

The demos span Python, C#, raw HTTP, Go, and N8N. Vision AI is not tied to a specific stack — if your environment can make an HTTP request, it can call a vision model.

All projects:


Reading Notes #688

I'm always on the lookout for innovative ideas to streamline my development workflow. This week, I stumbled upon some fascinating reads that caught my eye, among them an article about building an AI-powered pull request agent using the GitHub Copilot SDK and another demonstrating the secure use of OpenClaw in Docker sandboxes.


AI

Programming

~frank


Reading Notes #683

A lot of good stuff crossed my radar this week. From Aspire’s continued evolution and local AI workflows with Ollama, to smarter, more contextual help in GitHub Copilot, the theme is clear: better tools, used more intentionally. I also bookmarked a few thoughtful pieces on leadership and communication that are worth slowing down for. Plenty here to explore, whether you’re deep in code or thinking about how teams actually work.

Meetup MsDevMtl

Programming

AI

Open Source

  • The end of the curl bug-bounty (Daniel Stenberg) - I didn't know about this effort, and it's sad to be learning about it only as it ends, but I'm glad those programs exist.

Miscellaneous

  • Why I Still Write Code as an Engineering Manager (James Sturtevant) - There is still hope, everyone! But more seriously, an inspiring post that managers should read.

  • The Art of the Oner (Golnaz) - Another great post from Golnaz about how to help the message land: how and why "oner" takes help when presenting, and the effort they represent.

Sharing my Reading Notes is a habit I started a long time ago, where I share a list of all the articles, blog posts, and books that catch my interest during the week.

If you have interesting content, share it!

~frank



Automatically Create AI Clips with This n8n Template

I'm excited to share that my new n8n template has been approved and is now available for everyone to use! This template automates the process of creating AI-generated video clips from YouTube videos and sending notifications directly to your inbox.

French version of this post here

Try the template here: https://link.reka.ai/n8n-template-api


What Does This Template Do?

If you've ever wanted to automatically create short clips from long YouTube videos, this template is for you. It watches a YouTube channel of your choice, and whenever a new video is published, it uses AI to generate engaging short clips perfect for social media. You get notified by email when your clip is ready to download.

How It Works

The workflow is straightforward and runs completely on autopilot:

  1. Monitor YouTube channels - The template watches the RSS feed of any YouTube channel you specify. When a new video appears, the automation kicks off.

  2. Request AI clip generation - Using Reka's Vision API, the workflow sends the video for AI processing. You have full control over the output:

    • Write a custom prompt to guide the AI on what kind of clip to create
    • Choose whether to include captions
    • Set minimum and maximum clip duration
  3. Smart status checking - When the clips are ready, you receive a success email with your download link. As a safety feature, if the job takes too long, you'll get an error notification instead.

Getting Started is Easy

The best part? You can install this template with just one click from the n8n Templates page. No complex setup required!

After installation, you'll just need two quick things:

  • A free Reka AI API key (get yours from Reka)
  • A Gmail account (or use any email provider you like)

That's it! The template comes ready to use. Simply add your YouTube channel RSS feed, connect your API key, and you're ready to start generating clips automatically. The whole setup takes just a few minutes.

If you run into any questions or want to share what you've built, join the Reka Discord community. I'd love to hear how you're using this template!

Show Me

In this short video, I show you how to get the template into your n8n instance and how to configure it.

Happy clipping!

Writing My First Custom n8n Node: A Step-by-Step Guide

Recently, I decided to create a custom node for n8n, the workflow automation tool I've been using. I'm not an expert in Node.js development, but I wanted to understand how n8n nodes work under the hood. This blog post shares my journey and the steps that actually worked for me.

French version here

Why I Did This

Before starting this project, I was curious about how n8n nodes are built. The best way to learn something is by doing it, so I decided to create a simple custom node following n8n's official tutorial. Now that I understand the basics, I'm planning to build a more complex node featuring AI Vision capabilities, but that's for another blog post!

The Challenge

I started with the official n8n tutorial: Build a declarative-style node. While the tutorial is well-written, I ran into some issues along the way. The steps didn't work exactly as described, so I had to figure out what was missing. This post documents what actually worked for me, in case you're facing similar challenges. I already have an n8n instance running in a container. In Step 8, I'll explain how I run a second instance for development purposes.

Prerequisites

Before you start, you'll need:

  • Node.js and npm - I used Node.js version 24.12.0
  • Basic understanding of JavaScript/TypeScript - you don't need to be an expert

Step 1: Fixing the Missing Prerequisites

I didn't have Node.js installed on my machine, so my first step was getting that sorted out. Instead of installing Node.js directly, I used nvm (Node Version Manager), which makes it easy to manage different Node.js versions. Installation details are available on the nvm GitHub repository. Once nvm was set up, I installed Node.js version 24.12.0.

Most of the time, I use VS Code as my code editor. I created a new profile and used the template for Node.js development to get the right extensions and settings.

Step 2: Cloning the Starter Repository

n8n provides an n8n-nodes-starter repository on GitHub that includes all the basic files and dependencies you need. You can clone it or use it as a template for your own project. Since this was just a "learning exercise" for me, I cloned the repository directly:

git clone https://github.com/n8n-io/n8n-nodes-starter
cd n8n-nodes-starter

Step 3: Getting Started with the Tutorial

I won't repeat the entire tutorial here; it's clear enough, but I'll highlight some details along the way that I found useful.

The tutorial has you create a "NasaPics" node and provides a logo for it. That works, but I suggest using your own logo images, with both light and dark versions. Add both images in a new icons folder (at the same level as the nodes and credentials folders). Having two versions of the logo will make your node look better whatever theme the user has in n8n (light or dark). The tutorial only adds the logo in NasaPics.node.ts, but I found that adding it in the credentials file NasaPicsApi.credentials.ts as well makes the node look more consistent.

Replace or add the logo line with this, and add Icon to the import statement at the top of the file:

icon: Icon = { light: 'file:MyLogo-dark.svg', dark: 'file:MyLogo-light.svg' };

Note: the darker logo should be used in light mode, and vice versa.

Step 4: Following the Tutorial (With Adjustments)

Here's where things got interesting. I followed the official tutorial to create the node files, but I had to make some adjustments that weren't mentioned in the documentation.

Adjustment 1: Making the Node Usable as a Tool

In the NasaPics.node.ts file, I added this line just before the properties array:

requestDefaults: {
      baseURL: 'https://api.nasa.gov',
      headers: {
         Accept: 'application/json',
         'Content-Type': 'application/json',
      },
   },
   usableAsTool: true, // <-- Added this line
   properties: [
      // Resources and operations will go here

This setting allows the node to be used as a tool within n8n workflows and also fixes warnings from the lint tool.

Adjustment 2: Securing the API Key Field

In the NasaPicsApi.credentials.ts file, I added a typeOptions to make the API key field a password field. This ensures the API key is hidden when users enter it, which is a security best practice.

properties: INodeProperties[] = [
   {
      displayName: 'API Key',
      name: 'apiKey',
      type: 'string',
      typeOptions: { password: true }, // <-- Added this line
      default: '',
   },
];

A Note on Errors

I noticed there were some other errors showing up in the credentials file. If you read the error message, you'll see that it's complaining about missing test properties. To fix this, I added a test property at the end of the class that implements ICredentialTestRequest. I also added the interface import at the top of the file.

authenticate: IAuthenticateGeneric = {
   type: 'generic',
   properties: {
      qs: {
         api_key: '={{$credentials.apiKey}}',
      },
   },
};

// Add this at the end of the class
test: ICredentialTestRequest = {
   request: {
      baseURL: 'https://api.nasa.gov/',
      url: '/user',
      method: 'GET',
   },
};

Step 5: Building and Linking the Package

Once I had all my files ready, it was time to build the node. From the root of my node project folder, I ran:

npm i
npm run build
npm link

During the build process, pay attention to the package name that gets generated. In my case, it was n8n-nodes-nasapics. You'll need this name in the next steps.

> n8n-nodes-nasapics@0.1.0 build
> n8n-node build

┌   n8n-node build 
│
◓  Building TypeScript files│
◇  TypeScript build successful
│
◇  Copied static files
│
└  ✓ Build successful

Step 6: Setting Up the n8n Custom Folder

n8n looks for custom nodes in a specific location: ~/.n8n/custom/. If this folder doesn't exist, you need to create it:

mkdir -p ~/.n8n/custom
cd ~/.n8n/custom

Then initialize a new npm package in this folder: run npm init and press Enter to accept all the defaults.

Step 7: Linking Your Node to n8n

Now comes the magic part - linking your custom node so n8n can find it. Replace n8n-nodes-nasapics with whatever your package name is. From the ~/.n8n/custom folder, run:

npm link n8n-nodes-nasapics

Step 8: Running n8n

This is where my setup differs from the standard tutorial. As mentioned at the beginning, I already have an instance of n8n running in a container and didn't want to install it. So I decided to run a second container using a different port. Here's the command I used:

docker run -d --name n8n-DEV -p 5680:5678 \
  -e N8N_COMMUNITY_PACKAGES_ENABLED=true \
  -v ~/.n8n/custom/node_modules/n8n-nodes-nasapics:/home/node/.n8n/custom/node_modules/n8n-nodes-nasapics \
  n8nio/n8n

Let me break down what this command does:

  • -d: Runs the container in detached mode (in the background)
  • --name n8n-DEV: Names the container for easy reference
  • -p 5680:5678: Maps port 5678 from the container to port 5680 on my machine so it doesn't conflict with my existing n8n instance
  • -e N8N_COMMUNITY_PACKAGES_ENABLED=true: Enables community packages — you need this to use custom nodes
  • -v: Mounts my custom node folder into the container, which lets me try my custom node without having to publish it.
  • n8nio/n8n: The official n8n container image

If you're running n8n directly on your machine (not in a container), you can simply start it.

Step 9: Testing Your Node

Once n8n-DEV is running, open your browser and navigate to it. Create a new workflow and search for your node. In my case, I searched for "NasaPics" and my custom node appeared!

To test it:

  1. Add your node to the workflow
  2. Configure the credentials with a NASA API key (you can get one for free at api.nasa.gov)
  3. Execute the node
  4. Check if the data is retrieved correctly

Updating Your Node

During development, you'll likely need to make changes to your node's code. Once done, rebuild with npm run build and restart the n8n container with docker restart n8n-DEV to see the changes.

What's Next?

Now that I understand the basics of building custom n8n nodes, I'm ready to tackle something more ambitious. My next project will be creating a node that uses AI Vision capabilities. Spoiler alert: It's done and I'll be sharing the details in an upcoming blog post!

If you're interested in creating your own custom nodes, I encourage you to give it a try. Start with something simple, like I did, and build from there. Don't be afraid to experiment and make mistakes - that's how we learn!

Resources

From Hours to Minutes: AI That Finds Tech Events for You

TL;DR

I built an AI research agent that actually browses the live web and finds tech events, no search loops, no retry logic, no hallucinations. Just ask a question and get structured JSON back with the reasoning steps included. The secret? An API that handles multi-step research automatically. Built with .NET/Blazor in a weekend. Watch the video | Get the code | Free API key
(French version)

Happy New Year! I wanted to share something I recently presented at the AI Agents Conference 2025: how to build intelligent research assistants that can search the live web and return structured, reliable results.

Coming back from the holidays, I'm reminded of a universal problem: information overload. Whether it's finding relevant tech conferences, catching up on industry news, or wading through piles of documentation that accumulated during time off, we all need tools that can quickly search and synthesize information for us. That's what Reka Research does: it's an agentic AI that browses the web (or your private documents), answers complex questions, and turns hours of research into minutes. I built a practical demo to show this in action: an Event Finder that searches the live internet for upcoming tech conferences.

The full presentation is available on YouTube if you want to follow along: How to Build Agentic Web Research Assistants

The Problem: Finding Events Isn't Just a Simple Search

Let me paint a picture. You want to find upcoming tech conferences about AI in your area. You need specific information: the event name, start and end dates, location, and most importantly, the registration URL.

A simple web search or basic LLM query falls short because:

  • You might get outdated information
  • The first search result rarely contains all required details
  • You need to cross-reference multiple sources
  • Without structure, the data is hard to use in an application

This is where Reka's Research API shines. It doesn't just search, it reasons through multiple steps, aggregates information, and returns structured, grounded results.

Event finder interface

The Solution: Multi-Step Research That Actually Works

The core innovation here is multi-step grounding. Instead of making a single query and hoping for the best, the Research API acts like a diligent human researcher:

  1. It makes an initial search based on your query
  2. Checks what information is missing
  3. Performs additional targeted searches
  4. Aggregates and validates the data
  5. Returns a complete, structured response

As a developer, you simply send your question, and the API handles the complex iteration. No need to build your own search loops or retry logic.

How It Works: The Developer Experience

Here's what surprised me most: the simplicity. You define your data structure, ask a question, and the API handles all the complex research orchestration. No retry logic, no search loop management.

The key is structured output. Instead of parsing messy text, you tell the API exactly what JSON schema you want:

public class TechEvent
{
    public string? Name { get; set; }
    public DateTime? StartDate { get; set; }
    public DateTime? EndDate { get; set; }
    public string? City { get; set; }
    public string? Country { get; set; }
    public string? Url { get; set; }
}

Then you send your query with the schema, and it returns perfectly structured data every time. The API uses OpenAI-compatible format, so if you've worked with ChatGPT's API, this feels instantly familiar.
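
As a minimal sketch of that request shape (shown in Python for brevity; the response_format field follows the OpenAI structured-output convention, so verify the exact parameter against the Reka docs):

from openai import OpenAI

client = OpenAI(base_url="https://api.reka.ai/v1", api_key="<REKA_API_KEY>")

# JSON schema mirroring the TechEvent class above.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "startDate": {"type": "string"},
        "endDate": {"type": "string"},
        "city": {"type": "string"},
        "country": {"type": "string"},
        "url": {"type": "string"},
    },
}

completion = client.chat.completions.create(
    model="reka-flash-research",
    messages=[{"role": "user",
               "content": "Find upcoming AI conferences in Montreal."}],
    response_format={"type": "json_schema",
                     "json_schema": {"name": "TechEvent", "schema": schema}},
)
print(completion.choices[0].message.content)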

The real magic? You also get back the reasoning steps, the actual web searches it performed and how it arrived at the answer. Perfect for debugging and understanding the agent's thought process.

I walk through the complete implementation, including domain filtering, location-aware search, and handling the async research calls in the video. The full source code is on GitHub if you want to dive deeper.


Try It Yourself

The complete source code is on GitHub. Clone it, grab a free API key, and you'll have it running in under 5 minutes.

I'm curious what you'll build with this. Research agents that monitor news? Product comparison tools? Documentation synthesizers? The API works for any web research task. If you build something, tag me. I'd love to see it.

Happy New Year! 🎉

Reading Notes #676

This week's reading notes explore practical insights on leveraging GitHub Copilot for enhanced .NET testing, the rise of AI-driven documentation solutions, and the importance of security in coding agents. From dissecting Docker’s MCP servers to debating the merits of Minimal APIs, we cover a mix of .NET updates, developer workflows, and emerging best practices. Whether you’re refining build processes, optimizing codebases, or staying ahead of security trends, these notes offer a curated selection of ideas to spark your next project or refactor.



Let’s unpack what’s new and impactful in tech!

AI

Programming

~frank


Ask AI from Anywhere: No GUI, No Heavy Clients, No Friction

Ever wished you could ask AI from anywhere without needing an interface? Imagine just typing ? and your question in any terminal the moment it pops into your head, and getting the answer right away! In this post, I explain how I wrote a tiny shell script that turns this idea into reality, transforming the terminal into a universal AI client. You can query Reka, OpenAI, or a local Ollama model from any editor, tab, or pipeline—no GUI, no heavy clients, no friction.

Small, lightweight, and surprisingly powerful: once you make it part of your workflow, it becomes indispensable.

💡 All the code scripts are available at: https://github.com/reka-ai/terminal-tools


The Core Idea

There is almost always a terminal within reach—embedded in your editor, sitting in a spare tab, or already where you live while building, debugging, and piping data around. So why break your flow to open a separate chat UI? I wanted to just type a single character (?) plus my question and get an answer right there. No window hopping. No heavy client.

How It Works

The trick is delightfully small: send a single JSON POST request to whichever AI provider you feel like (Reka, OpenAI, Ollama locally, etc.):

# Example: Reka
curl https://api.reka.ai/v1/chat \
     -H "X-Api-Key: <API_KEY>" \
     -H "Content-Type: application/json" \
     -d '{
           "messages": [
             {
               "role": "user",
               "content": "What is the origin of thanksgiving?"
             }
           ],
           "model": "reka-core",
           "stream": false
         }'

# Example: Ollama local
curl http://127.0.0.1:11434/api/chat \
     -d '{
           "model": "llama3",
           "messages": [
             {
               "role": "user",
               "content": "What is the origin of thanksgiving?"
             }
           ],
           "stream": false
         }'

Once we get the response, we extract the answer field from it. A thin shell wrapper turns that into a universal “ask” verb for your terminal. Add a short alias (?) and you have the most minimalist AI client imaginable.

Let's go into the details

Let me walk you through the core script step-by-step using reka-chat.sh, so you can customize it the way you like. Maybe this is a good moment to mention that Reka has a free tier that's more than enough for this. Go grab your key—after all, it's free!

The script (reka-chat.sh) does four things:

  1. Captures your question
  2. Loads an API key from ~/.config/reka/api_key
  3. Sends a JSON payload to the chat endpoint with curl.
  4. Extracts the answer using jq for clean plain text.

1. Capture Your Question

This part of the script is a pure laziness hack. I wanted to save keystrokes by not requiring quotes when passing a question as an argument. So ? What is 32C in F works just as well as ? "What is 32C in F".

if [ $# -eq 0 ]; then
    if [ ! -t 0 ]; then
        QUERY="$(cat)"
    else
        exit 1
    fi
else
    QUERY="$*"
fi

2. Load Your API Key

If you're running Ollama locally you don't need any key, but for all other AI providers you do. I store mine in a locked-down file at ~/.config/reka/api_key, then read and trim trailing whitespace like this:

API_KEY_FILE="$HOME/.config/reka/api_key"
API_KEY=$(cat "$API_KEY_FILE" | tr -d '[:space:]')

3. Send The JSON Payload

Building the JSON payload is the heart of the script, including the API_ENDPOINT, API_KEY, and obviously our QUERY. Here’s how I do it for Reka:

RESPONSE=$(curl -s -X POST "$API_ENDPOINT" \
     -H "X-Api-Key: $API_KEY" \
     -H "Content-Type: application/json" \
     -d "{
  \"messages\": [
    {
      \"role\": \"user\",
      \"content\": $(echo "$QUERY" | jq -R -s .)
    }
  ],
  \"model\": \"reka-core\",
  \"stream\": false
}")

4. Extract The Answer

Finally, we parse the JSON response with jq to pull out just the answer text. If jq isn't installed we display the raw response, but a formatted answer is much nicer. If you are customizing for another provider, you may need to adjust the JSON path here. You can add echo "$RESPONSE" >> data_sample.json to the script to log raw responses for tinkering.

With Reka, the response looks like this:

{
    "id": "cb7c371b-3a7b-48d2-829d-70ffacf565c6",
    "model": "reka-core",
    "usage": {
        "input_tokens": 16,
        "output_tokens": 460,
        "reasoning_tokens": 0
    },
    "responses": [
        {
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": " The origin of Thanksgiving ..."
            }
        }
    ]
}

The value we are looking for and want to display is the `content` field inside `responses[0].message`. Using `jq`, we do:

echo "$RESPONSE" | jq -r '.responses[0].message.content // .error // "Error: Unexpected response format"'

Putting It All Together

Now that we have the script, make it executable with chmod +x reka-chat.sh, and let's add an alias to your shell config to make it super easy to use. Add one line to your .zshrc or .bashrc that looks like this:

alias \\?=\"$REKA_CHAT_SCRIPT\"

Because ? is a special character in the shell, we escape it with a backslash. After adding this line, reload your shell configuration with source ~/.zshrc or source ~/.bashrc, and you are all set!

The Result

Now you can ask questions directly from your terminal. Wanna know the origin of Thanksgiving? Ask it like this:

? What is the origin of Thanksgiving

And if you want to keep the quotes, you do you!

Extra: Web research

I couldn't stop there! Reka also supports web research, which means it can fetch and read web pages to provide more informed answers. Following the same pattern described previously, I wrote a similar script called reka-research.sh that sends a request to Reka's research endpoint. This obviously takes a bit more time to answer, as it's making different web queries and processing them, but the results are often worth the wait—and they are up to date! I used the alias ?? for this one.

On the GitHub repository, you can find both scripts (reka-chat.sh and reka-research.sh) along with a script to create the aliases automatically. Feel free to customize them to fit your workflow and preferred AI provider. Enjoy the newfound superpower of instant AI access right from your terminal!

What's Next?

With this setup, the possibilities are endless. Reka supports questions related to audio and video, which could be interesting to explore next. The project is open source, so feel free to contribute or suggest improvements. You can also join the Reka community on Discord to share your experiences and learn from others.


Resources




Check-In Doc MCP Server: A Handy Way to Search Only the Docs You Trust

Ever wished you could ask a question and have the answer come only from a handful of trusted documentation sites—no random blogs, no stale forum posts? That’s exactly what the Check-In Doc MCP Server does. It’s a lightweight Model Context Protocol (MCP) server you can run locally (or host) to funnel questions to selected documentation domains and get a clean AI-generated answer back.

What It Is

The project (GitHub: https://github.com/fboucher/check-in-doc-mcp) is a Dockerized MCP server that:

  • Accepts a user question.
  • Calls the Reka AI Research API with constraints (only allowed domains).
  • Returns a synthesized answer based on live documentation retrieval.

You control which sites are searchable by passing a comma-separated list of domains (e.g. docs.reka.ai,docs.github.com). That keeps results focused, reliable, and relevant.

What Is the Reka AI Research API?

Reka AI’s Research API lets you blend language model reasoning with targeted, on‑the‑fly web/document retrieval. Instead of a model hallucinating an answer from static training data, it can:

  • Perform limited domain‑scoped web searches.
  • Pull fresh snippets.
  • Integrate them into a structured response.

In this project, we use the research feature with a web_search block specifying:

  • allowed_domains: Only the documentation sites you trust.
  • max_uses: Caps how many retrieval calls it makes per query (controls cost & latency).

Details used here:

  • Model: reka-flash-research
  • Endpoint: http://api.reka.ai/v1/chat/completions
  • Auth: Bearer API key (generated from the Reka dashboard: https://link.reka.ai/free)
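
Before looking at the C# internals, here is the same call as a compact Python sketch, mirroring the payload the service builds below:

import requests

payload = {
    "model": "reka-flash-research",
    "messages": [{"role": "user", "content": "How do I index a video?"}],
    "research": {
        "web_search": {
            "allowed_domains": ["docs.reka.ai"],  # only the docs you trust
            "max_uses": 4,                        # cap retrieval calls
        }
    },
}
resp = requests.post(
    "http://api.reka.ai/v1/chat/completions",
    headers={"Authorization": "Bearer <APIKEY>"},
    json=payload,
    timeout=120,
)
print(resp.json())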

How It Works Internally

The core logic lives in ResearchService (src/Domain/ResearchService.cs). Simplified flow:

  1. Initialization
    Stores the API key + array of allowed domains, sets model & endpoint, logs a safe startup message.

  2. Build Request Payload
    The CheckInDoc(string question) method creates a JSON payload:

    var requestPayload = new {
      model,
      messages = new[] { new { role = "user", content = question } },
      research = new {
        web_search = new {
          allowed_domains = allowedDomains,
          max_uses = 4
        }
      }
    };
    
  3. Send Request
    Creates a HttpRequestMessage (POST), adds Authorization: Bearer <APIKEY>, sends JSON to Reka.

  4. Parse Response
    Deserializes into a RekaResponse domain object, returns the first answer string.

Adding It to VS Code (MCP Extension)

You can run it as a Docker-based MCP server. Two simple approaches:

Option 1: Via “Add MCP Server” UI

  1. In VS Code (with MCP extension), click Add MCP Server.
  2. Choose type: Docker image.
  3. Image name: fboucher/check-in-doc-mcp.
  4. Enter allowed domains and your Reka API key when prompted.

Option 2: Via mcp.json (Recommended)

Alternatively, you can manually configure it in your mcp.json file. This will make sure your API key isn't displayed in plain text. Add or merge this configuration:

{
  "servers": {
    "check-in-docs": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-e",
        "ALLOWED_DOMAINS=${input:allowed_domains}
        ",
        "-e",
        "APIKEY=${input:apikey}",
        "fboucher/check-in-doc-mcp"
      ]
    }
  },
  "inputs": [
    {
      "id": "allowed_domains",
      "type": "promptString",
      "description": "Enter the comma-separated list of documentation domains to allow (e.g. docs.reka.ai,docs.github.com):"
    },
    {
      "id": "apikey",
      "type": "promptString",
      "password": true,
      "description": "Enter your Reka Platform API key:"
    }
  ]
}

How to Use It

To use it, ask to "Check In Doc" something, or use the SearchInDoc tool directly in your MCP-enabled environment. Just ask a question, and it will search only the specified documentation domains.

Final Thoughts

It’s intentionally simple—no giant orchestration layer. Just a clean bridge between a question, curated domains, and a research-enabled model. Sometimes that’s all you need to get focused, trustworthy answers.

If this sparks an idea, clone it and adapt away. If you improve it (citations, richer error handling, multi-turn context)—send a PR!

Watch a quick demo


Links & References

AI Vision: Turning Your Videos into Comedy Gold (or Cringe)

I've spent most of my career building software in C# and .NET, and I've only used Python in IoT projects. When I wanted to build a fun project (an app that uses AI to roast videos), I knew it was the perfect opportunity to finally dig into Python web development.

The question was: where do I start? I hopped into a brainstorming session with Reka's AI chat and asked about options for building web apps in Python. It mentioned Flask, and I remembered friends talking about it being lightweight and perfect for getting started. That sounded right.

In this post, I share how I built "Roast My Life," a Flask app using the Reka Vision API.

The Vision (Pun Intended)

The app needed three core things:

  1. List videos: Show me what videos are in my collection
  2. Upload videos: Let me add new ones via URL
  3. Roast a video: Send a selected video to an AI and get back some hilarious commentary

See it in action

Part 1: Getting Started with Environment Setup

The first hurdle was always going to be environment setup. I'm serious about keeping my Python projects isolated, so I did the standard dance:

Before even touching dependencies, I scaffolded a super bare-bones Flask app. One thing I enjoy from C# is that all dependencies are brought in one shot, so I like doing the same with my Python projects: a requirements.txt instead of installing things ad hoc (pip install flask, then freezing later).

Dropping that file in first means the setup snippet below is deterministic. When you run pip install -r requirements.txt, Flask spins up using the exact versions I tested with, and you won't accidentally grab a breaking major update.

Here's the shell dance that activates the virtual environment and installs everything:

cd roast_my_life/workshop
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Then came the configuration. We will need an API key, and I don't want it hardcoded, so I created a .env file to store my API credentials:

API_KEY=YOUR_REKA_API_KEY
BASE_URL=https://vision-agent.api.reka.ai

To get that API key, I visited the Reka Platform and grabbed a free one. Seriously, a free key for playing with AI vision APIs? I was in.

With python app.py, I fired up the Flask development server and opened http://127.0.0.1:5000 in my browser. The UI was there, but... it was dead. Nothing worked.

Perfect. Time to build.

The Backend: Flask Routing and API Integration

Coming from ASP.NET Core's controller-based routing and Blazor, Flask's decorator-based approach felt right at home. All the code goes in the app.py file, and each route is defined with a simple decorator. But first things first: loading configuration from the .env file using python-dotenv:

from flask import Flask, request, jsonify
import requests
import os
from dotenv import load_dotenv

app = Flask(__name__)

# Load environment variables (like appsettings.json)
load_dotenv()
api_key = os.environ.get('API_KEY')
base_url = os.environ.get('BASE_URL')

All the imported packages are the same ones that need to be listed in requirements.txt. We then retrieve the API key and base URL from environment variables, just like in .NET Core.

Now, to get roasted, we first need to upload a video to the Reka Vision API. Here's the code; I'll go over some details after.


@app.route('/api/upload_video', methods=['POST'])
def upload_video():
    """Upload a video to Reka Vision API"""
    data = request.get_json() or {}
    video_name = data.get('video_name', '').strip()
    video_url = data.get('video_url', '').strip()
    
    if not video_name or not video_url:
        return jsonify({"error": "Both video_name and video_url are required"}), 400
    
    if not api_key:
        return jsonify({"error": "API key not configured"}), 500
    
    try:
        response = requests.post(
            f"{base_url.rstrip('/')}/videos/upload",
            headers={"X-Api-Key": api_key},
            data={
                'video_name': video_name,
                'index': 'true',  # Required: tells Reka to process the video
                'video_url': video_url
            },
            timeout=30
        )
        
        response_data = response.json() if response.ok else {}
        
        if response.ok:
            video_id = response_data.get('video_id', 'unknown')
            return jsonify({
                "success": True,
                "video_id": video_id,
                "message": "Video uploaded successfully"
            })
        else:
            error_msg = response_data.get('error', f"HTTP {response.status_code}")
            return jsonify({"success": False, "error": error_msg}), response.status_code
            
    except requests.Timeout:
        return jsonify({"success": False, "error": "Request timed out"}), 504
    except Exception as e:
        return jsonify({"success": False, "error": f"Upload failed: {str(e)}"}), 500

Once the information from the frontend is validated, we make a POST request to the Reka Vision API's /videos/upload endpoint. The parameters are sent as form data, and we include the API key in the headers for authentication. Here I used URLs to upload videos, but you can also upload local files by adjusting the request accordingly, as sketched below. As you can see, it's pretty straightforward, and the documentation from Reka made it easy to understand what was needed.
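
Here is a hedged sketch of that adjustment for local files; the multipart field name video_file is an assumption, so check the Reka docs for the exact contract:

import requests

def upload_local_video(path: str, name: str) -> dict:
    """Upload a local file instead of a URL (field name is an assumption)."""
    with open(path, "rb") as f:
        resp = requests.post(
            f"{base_url.rstrip('/')}/videos/upload",
            headers={"X-Api-Key": api_key},
            data={"video_name": name, "index": "true"},
            files={"video_file": f},  # hypothetical multipart field name
            timeout=120,
        )
    return resp.json()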

The Magic: Sending Roast Requests to Reka Vision API

Here's where things get interesting. Once a video is uploaded, we can ask the AI to analyze it and generate content. The Reka Vision API supports conversational queries about video content:

def call_reka_vision_qa(video_id: str) -> Dict[str, Any]:
    """Call the Reka Video QA API to generate a roast"""
    
    headers = {'X-Api-Key': api_key} if api_key else {}
    
    payload = {
        "video_id": video_id,
        "messages": [
            {
                "role": "user",
                "content": "Write a funny and gentle roast about the person, or the voice in this video. Reply in markdown format."
            }
        ]
    }
    
    try:
        resp = requests.post(
            f"{base_url}/qa/chat",
            headers=headers,
            json=payload,
            timeout=30
        )
        
        data = resp.json() if resp.ok else {"error": f"HTTP {resp.status_code}"}
        
        if not resp.ok and 'error' not in data:
            data['error'] = f"HTTP {resp.status_code} calling chat endpoint"
        
        return data
        
    except requests.Timeout:
        return {"error": "Request to chat API timed out"}
    except Exception as e:
        return {"error": f"Chat API call failed: {e}"}

Here we pass the video ID and a prompt asking for a "funny and gentle roast." The API responds with AI-generated content, which we can then send back to the frontend for display. I give the AI more "freedom" by asking it to reply in markdown format, which makes the output more engaging.

Try It Yourself!

The complete project is available on GitHub: reka-ai/api-examples-python

What Makes Reka Vision API So nice to use

What really stood out to me was how approachable the Reka Vision API is. You don't need any special SDK—just the requests library making standard HTTP calls. And honestly, it doesn't matter what language you're used to; an HTTP call is pretty much always simple to do. Whether you're coming from .NET, Python, JavaScript, or anything else, you're just sending JSON and getting JSON back.

Authentication is refreshingly straightforward: just pop your API key in the header and you're good to go. No complex SDKs, no multi-step authentication flows, no wrestling with binary data streams. The conversational interface lets you ask questions in natural language, and you get back structured JSON responses with clear fields.

One thing worth noting: in this example, the videos are pre-uploaded and indexed, which means the responses come back fast. But here's the impressive part—the AI actually looks at the video content. It's not just reading a transcript or metadata; it's genuinely analyzing the visual elements. That's what makes the roasts so spot-on and contextual.

Final Thoughts

The Reka Vision API itself deserves credit for making video AI accessible. No complicated SDKs, no multi-GB model downloads, no GPU requirements. Just simple HTTP requests and powerful AI capabilities. I'm not saying I'm switching to Python full-time, but expect to see me sharing more Python projects in the future!

References and Resources


Reading Notes #670

This post dives into the intersection of AI and development, exploring tools like GitHub Copilot’s AGENTS.md and the MCP Toolkit for automations, alongside .NET 10.0’s performance gains and OpenAI’s recent updates. Whether you’re optimizing serverless APIs with AWS Lambda or mastering the Web Animation API, this post highlights breakthroughs in code efficiency, model customization, and cloud innovation. Dive into these thought-provoking reads to stay ahead in a rapidly changing world.




Have a nice week!

Suggestion of the week

Cloud

AI

Programming

  • How .NET 10.0 boosted JSON Schema performance by 18% (Matthew Adams) - Another example of the gain in performance just by upgrading to the latest .NET version.

  • The Web Animation API (Christian Nwamba) - It's the first time I've read about this Web Animation API; pretty cool, even if we need to be careful. I think the precision it offers could be very interesting for some animations.


Sharing my Reading Notes is a habit I started a long time ago, where I share a list of all the articles, blog posts, and books that catch my interest during the week.

If you have interesting content, share it!

~frank



Reading Notes #667

This week's post explores the intersection of AI, cloud, and DevOps, featuring updates on Microsoft’s Logic Apps integration, practical .NET tools for system automation, and strategies to enhance documentation for AI-driven workflows. Whether you’re refining enterprise security practices with NuGet’s Trusted Publishing or diving into the ethical nuances of AI through vector databases, this post offers a blend of technical deep dives and thought-provoking discussions. Don’t miss the podcast highlights, from DevOps innovation to the business impact of employee well-being, perfect for developers, architects, and curious minds alike. Let’s connect the dots in a world where code, creativity, and collaboration drive progress.






Programming

AI

DevOps

Podcasts

Miscellaneous

Sharing my Reading Notes is a habit I started a long time ago, where I share a list of all the articles, blog posts, and books that catch my interest during the week.

If you have interesting content, share it!

~frank

Using AI with .NET 10 Scripts: What Worked, What Didn’t, and Lessons Learned

I wanted to kick the tires on the upcoming .NET 10 C# script experience and see how far I could get calling Reka’s Research LLM from a single file, no project scaffolding, no .csproj. This isn’t a benchmark; it’s a practical tour to compare ergonomics, setup, and the little gotchas you hit along the way. I’ll share what worked, what didn’t, and a few notes you might find useful if you try the same.

All the sample code (and a bit more) is here: reka-ai/api-examples-dotnet · csharp10-script. The scripts run a small “top 3 restaurants” prompt so you can validate everything quickly.

We’ll make the same request in three ways:

  1. OpenAI SDK
  2. Microsoft.Extensions.AI for OpenAI
  3. Raw HttpClient

What you need

The C# "script" feature used below ships with the upcoming .NET 10 and is currently available in preview. If you prefer not to install a preview SDK, you can run everything inside the provided Dev Container or on GitHub Codespaces. I include a .devcontainer folder with everything set up in the repo.

Set up your API key

We are talking about APIs here, so of course, you need an API key. The good news is that it's free to sign up with Reka and get one! It's a 2-click process; more details in the repo. The API key is then stored in a .env file, and each script loads environment variables using DotNetEnv.Env.Load(), so your key is picked up automatically. I went this way instead of using dotnet user-secrets because I thought that's how it would be done in a CI/CD pipeline or a quick script.

Run the demos

From the csharp10-script folder, run any of these scripts; each line is an alternative way to make the same request:

dotnet run 1-try-reka-openai.cs
dotnet run 2-try-reka-ms-ext.cs
dotnet run 3-try-reka-http.cs

You should see a short list of restaurant suggestions.

Prompt Result: 3 restaurants


OpenAI SDK with a custom endpoint

Reka's API uses the OpenAI format; therefore, I thought of using the OpenAI NuGet package. To reference a package in a script, you use the #:package [package name]@[package version] directive at the top of the file. Here is an example:

#:package OpenAI@2.3.0

// ...

var baseUrl = "http://api.reka.ai/v1";

var openAiClient = new OpenAIClient(new ApiKeyCredential(REKA_API_KEY), new OpenAIClientOptions
{
    Endpoint = new Uri(baseUrl)
});

var client = openAiClient.GetChatClient("reka-flash-research");

string prompt = "Give me 3 nice, not crazy expensive, restaurants for a romantic dinner in Montreal";

var completion = await client.CompleteChatAsync(
    new List<ChatMessage>
    {
        new UserChatMessage(prompt)
    }
);

var generatedText = completion.Value.Content[0].Text;

Console.WriteLine($" Result: \n{generatedText}");

The rest of the code is straightforward. You create a chat client, specify the Reka API URL, select the model, and then send a prompt. It works just as expected. However, not everything was perfect; before I share more about that part, let's talk about Microsoft.Extensions.AI.

Microsoft Extensions AI for OpenAI

Another common way to use an LLM in .NET is one of the Microsoft.Extensions.AI NuGet packages. In our case, Microsoft.Extensions.AI.OpenAI was used.

#:package Microsoft.Extensions.AI.OpenAI@9.8.0-preview.1.25412.6

// ....

var baseUrl = "http://api.reka.ai/v1";

IChatClient client = new ChatClient("reka-flash-research", new ApiKeyCredential(REKA_API_KEY), new OpenAIClientOptions
{
    Endpoint = new Uri(baseUrl)
}).AsIChatClient();

string prompt = "Give me 3 nice, not crazy expensive, restaurants for a romantic dinner in Montreal";

Console.WriteLine(await client.GetResponseAsync(prompt));

As you can see, the code is very similar. Create a chat client, set the URL, the model, and add your prompt, and it works just as well.

That's two ways to use the Reka API with different SDKs, but maybe you would prefer to go "SDKless"; let's see how to do that.

Raw HttpClient calling the REST API

Without any SDK to help, there are a few more lines of code to write, but it's still pretty straightforward. Let's see the code:

using var httpClient = new HttpClient();

var baseUrl = "http://api.reka.ai/v1/chat/completions";

var requestPayload = new
{
    model = "reka-flash-research",
    messages = new[]
            {
                new
                {
                    role = "user",
                    content = "Give me 3 nice, not crazy expensive, restaurants for a romantic dinner in New York city"
                }
            }
};

using var request = new HttpRequestMessage(HttpMethod.Post, baseUrl);
request.Headers.Add("Authorization", $"Bearer {REKA_API_KEY}");

// Serialize the anonymous payload before attaching it to the request
var jsonPayload = JsonSerializer.Serialize(requestPayload);
request.Content = new StringContent(jsonPayload, Encoding.UTF8, "application/json");

var response = await httpClient.SendAsync(request);

var responseContent = await response.Content.ReadAsStringAsync();

var jsonDocument = JsonDocument.Parse(responseContent);

var contentString = jsonDocument.RootElement
    .GetProperty("choices")[0]
    .GetProperty("message")
    .GetProperty("content")
    .GetString();

Console.WriteLine(contentString);

So you create an HttpClient, prepare a request with the right headers and payload, send it, get the response, and parse the JSON to extract the text. In this case, you have to know the JSON structure of the response, but it follows the OpenAI format.

What did I learn from this experiment?

I used VS Code while trying the script functionality. One thing that surprised me was that I didn't get any IntelliSense or autocompletion. I tried disabling the DevKit extension and changing the OmniSharp settings, but no luck. My guess is that it's because the feature is in preview, and it will work just fine when .NET 10 is released in November 2025.

In this light environment, I encountered some issues where, for some reason, I couldn't use an https endpoint, so I had to use http. In the raw HttpClient script, I had some errors about Reflection not being available. It could be related to the preview or something else; I didn't investigate further.

For the most part, everything worked as expected. You can use C# code to quickly execute some tasks without any project scaffolding. It's a great way to try out the Reka API and see how it works.

What's Next?

While writing those scripts, I encountered multiple issues that aren't related to .NET but more about the SDKs when trying to do more advanced functionalities like optimization of the query and formatting the response output. Since it goes beyond the scope of this post, I will share my findings in a follow-up post. Stay tuned!

Video version

Here is a video version of this post


Learn more

Reading Notes #661

This week's post collects concise links and takeaways across .NET, AI, Docker, open source security, DevOps, and broader developer topics: from the .NET Conf call for content and Copilot prompts to Docker MCP tooling, container debugging tips, running .NET as WASM, and a fresh look at the 10x engineer idea.


Suggestion of the week

AI

Open Source

DevOps

Programming

Miscellaneous


Sharing my Reading Notes is a habit I started a long time ago, where I share a list of all the articles, blog posts, and books that catch my interest during the week.

If you have interesting content, share it!

~frank

Reading Notes #656

This week, we're exploring a wide range of topics, from .NET 10 previews and A/B testing to the latest in Azure development and AI. Plus, a selection of insightful podcast episodes to keep you informed and inspired.

Cloud


Programming

Open Source

  • Patrik Svensson (Patrik Svensson) - An interesting way to structure the flow that provides more detailed issues and PR with a clear purpose.

AI


Podcasts


~frank