Software Engineering Track - Levent Kantaroğlu, Jamal Hasanov, Amulya Bhatia
Flutter Usage in Firebase Studio
Levent Kantaroğlu
● Flutter, Dart, Firebase
● The Journey of Firebase Studio
● The Capabilities of Firebase Studio
● Developing with Flutter in Firebase Studio
Flutter UI Toolkit
Multi-Platform Fast Development Performance
firebase.studio
Keep up-to-date
Levent Kantaroğlu
- Thanks -
Questions & Answers
Running Processes - Many and Fast
A talk on Linux, Concurrency and Web Servers
Dr. Jamal Hasanov
School of IT and Engineering
Just printing
#include <stdio.h>
int main() {
    printf("Hello, World!\n");
}
a function from C (libc)
#include <stdio.h>
#include <unistd.h>
int main() {
    write(STDOUT_FILENO, "Hello, World!\n", 14);
}
We could also use a lower-level function for the printing.
What is this 14 then? It is the number of bytes to write: "Hello, World!\n" is 14 characters, counting the newline.
Just writing (not printing)
Source: https://manpages.debian.org/unstable/manpages-dev/write.2.en.html
Writing to a file
write(STDOUT_FILENO, "Hello, World!\n", 14);
STDOUT_FILENO defines where the output text shall be
directed to
These are the descriptors from unistd.h:
/* Standard file descriptors. */
#define STDIN_FILENO 0 /* Standard input. */
#define STDOUT_FILENO 1 /* Standard output. */
#define STDERR_FILENO 2 /* Standard error output. */
Why a file descriptor?
fd = open(…)
read(fd)
write(fd,…)
close(fd)
Calling everything a file brings an abstraction – the same approach for
different sources
Does not matter where you write to:
Disk
Network
A buffer of an embedded device
Implementation of the abstraction is done through the drivers
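The fd lifecycle above can be sketched in a few lines. This is a minimal sketch using Python's os module, which wraps the same POSIX open/read/write/close calls shown in the C snippets; the temp file simply stands in for any source you might write to.

```python
# Minimal sketch of the open/read/write/close lifecycle using Python's
# os module, which wraps the same POSIX calls as the C examples above.
import os
import tempfile

def roundtrip(data: bytes) -> bytes:
    """Write bytes through a descriptor, rewind, and read them back."""
    fd, path = tempfile.mkstemp()          # fd = open(...)
    try:
        os.write(fd, data)                 # write(fd, ...)
        os.lseek(fd, 0, os.SEEK_SET)       # rewind to the start
        return os.read(fd, len(data))      # read(fd)
    finally:
        os.close(fd)                       # close(fd)
        os.unlink(path)

print(roundtrip(b"Hello, World!\n"))       # the same 14 bytes come back
```

The same four calls would work unchanged if `fd` referred to a socket or a device node instead of a disk file - that is the abstraction the slide describes.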
In Linux everything is considered as a file
A process is a file too
A process is an instance of a running
program. Every time a program is launched or a
command is executed, a new process with a
unique ID (PID) is created.
proc is a pseudo-filesystem (not stored on disk) that
provides a view into the kernel’s internal data
structures.
procfs stands for Process File System, typically
mounted at /proc.
It is a virtual filesystem - its files and directories exist
only in memory and are generated dynamically by the
kernel.
Each process has its own directory
/proc/<PID>/ containing details such as:
Process files Purpose
/proc/<PID>/cmdline Command-line arguments
/proc/<PID>/cwd Symbolic link to the current working directory
/proc/<PID>/exe Symbolic link to the executable
/proc/<PID>/fd/ Open file descriptors
/proc/<PID>/status General process info (UIDs, memory, state, etc.)
/proc/<PID>/stat, /proc/<PID>/statm Numeric statistics (CPU time, memory usage, etc.)
/proc/<PID>/environ Environment variables
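A quick way to see these files in action: the sketch below parses the `Key:\tValue` lines of `/proc/<PID>/status` and lists the process's open descriptors. It is Linux-only (procfs must exist), and the helper name `parse_status` is ours.

```python
# Hedged sketch: reading a process's details from procfs (Linux only).
# parse_status (our helper) turns "Key:\tValue" lines into a dict.
import os

def parse_status(text: str) -> dict:
    info = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            info[key] = value.strip()
    return info

if os.path.exists("/proc/self/status"):        # i.e. we are on Linux
    with open("/proc/self/status") as f:
        status = parse_status(f.read())
    print(status["Name"], status["Pid"], status["State"])
    print(os.listdir("/proc/self/fd"))          # this process's open fds
```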
A process is a file too
Let’s see the files of this process!
Abstraction for communication
A pseudo-terminal, often abbreviated as PTY, is a virtual
device in Unix-like operating systems, including Linux.
It functions as a pair of virtual character devices that
establish a bidirectional communication channel
between processes.
PTY (/dev/pts/1)
Master device (ptm or ptmx) is controlled by a
terminal emulator process or remote login server.
Anything written to the master is given to the slave
as input.
Slave device (pts) acts exactly like a traditional
hardware terminal (tty) device to the programs
running within it. Programs write their output to
the slave, which is then read by the master
process.
The SSH daemon process on the
remote server manages the
connection. It forwards the client
keystrokes received over the
network to the pty master and
reads output from the master to
send it back to the client.
The shell that starts on the remote
server and runs the client's commands. It
receives input from the pty slave,
which is being fed by the SSH
daemon.
How to make two computers talk?
Image source: iStock.
Internet as a file
Server Client
Being blocked
Read on a descriptor blocks if there’s no data
available.
fd = open(…)
read(fd)
Did somebody press a key?
Did somebody move the mouse?
Did somebody send a message over network?
The same is true for write.
Disk files are an exception, since writes to disk
happen via the kernel buffer cache, not
directly to the device.
Being blocked
fd = open(…)
write(fd,…)
Buffer
cache
The only time when writes to disk happen synchronously
is when the O_SYNC flag was specified when opening
the disk file.
Non-blocking file descriptors
A descriptor can be put in the nonblocking mode by
setting the O_NONBLOCK flag
In this case, a call on that descriptor will return
immediately, even if that request can’t be immediately
completed. The return value can be either of the
following:
an error: when the operation cannot be completed at all
a partial count: when the input or output operation can be
partially completed
the entire result: when the I/O operation could be fully
completed
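These three outcomes can be demonstrated with a pipe. In the sketch below (using Python's os and fcntl modules, which wrap the same fcntl/read syscalls), we set `O_NONBLOCK` on the read end: reading with no data returns immediately with an EAGAIN-style error instead of blocking.

```python
# Sketch of a non-blocking descriptor: with O_NONBLOCK set on the read
# end of a pipe, read() fails fast with EAGAIN instead of blocking.
import fcntl
import os

def read_nonblocking(fd: int, n: int):
    """Return data if available, or None when the call would block."""
    try:
        return os.read(fd, n)
    except BlockingIOError:        # errno EAGAIN / EWOULDBLOCK
        return None

r, w = os.pipe()
flags = fcntl.fcntl(r, fcntl.F_GETFL)
fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)

print(read_nonblocking(r, 64))     # None - nothing written yet, no hang
os.write(w, b"hi")
print(read_nonblocking(r, 64))     # b'hi' - data was ready, returned at once
```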
Running more than we can
Context switching is the
process where the CPU saves
the state of a running task and
loads the state of another task
to switch from one to another,
enabling multitasking by
giving the illusion that
multiple processes are
running simultaneously.
Concurrency
Concurrency refers to the ability of a system to
execute multiple tasks through simultaneous
execution or time-sharing (context switching)
Improved performance by executing multiple tasks in parallel.
Better resource utilization, e.g., CPU and I/O devices are kept busy.
Scalability: can handle more clients, requests, or tasks simultaneously.
Modularity: concurrent tasks can be designed as independent components.
Fault isolation: failures in one task may not crash the entire system.
Motivations for concurrent applications:
Multi-threaded vs Multi-process
T1 T2 T3 T4
Process
Threads
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Threads and the main process
use the same memory space
Each process uses its own
memory space
Combined model
Image credit: M. van Steen and A.S. Tanenbaum, Distributed Systems, 4th ed., distributed-systems.net, 2023.
Implementation options
Image credit: medium post
Mx1 - the kernel is not aware of any of the user threads; there is only one
thread/process in the kernel, which serves the user-space scheduler.
1x1 - each user thread is mapped to one kernel thread.
MxN - a combination of Mx1 and 1x1: a kernel thread may serve an individual
user thread or the scheduler.
My first discovery of
Concurrency
Case 1: Advanced Payment System
Case 2: Billing System for Azeronline
“Many years later, as he faced the firing squad, Colonel
Aureliano Buendía was to remember that distant
afternoon when his father took him to discover ice”
Gabriel García Márquez
One Hundred Years of Solitude
Advanced Payment System
050123456 Search
Search number
Name & Surname
Make payments
Operation result
Advanced Payment System
Your
number
please
Your
number
please
111 222
Are you
John Smith?
Are you
Bob Adams?
Sorry,
I am Bob
Sorry,
I am John
Concurrency problems
Competition vs Cooperation
Race Condition
A resource
task task task task
update
A resource
task task task task
Concurrency solutions
A resource
A task
Mutex
A resource
A task
Semaphore
notify
A resource
A task
Monitor
Monitor
A resource
A task
Non-blocking
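The mutex solution above can be sketched with threads sharing one counter: the lock guarantees only one task updates the resource at a time, so no increments are lost. (A minimal sketch; real code would also need the lock around any read-modify-write of the shared state.)

```python
# Sketch of the mutex solution: many tasks update one resource, but the
# lock ensures only one thread increments the counter at a time.
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()   # the mutex guarding the resource

    def increment(self):
        with self.lock:                # acquire ... release
            self.value += 1

def run(n_threads: int = 4, n_iter: int = 100_000) -> int:
    c = Counter()
    def worker():
        for _ in range(n_iter):
            c.increment()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c.value

print(run())   # always n_threads * n_iter: no lost updates
```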
Concurrency Demo: one processor involved
Execution of the code
Calc in Loop 1
Calc in Loop 2
Calc in Loop 3
Calc in Loop 4
Calc in Loop 5
Calc in Loop 6
Calc in Loop 7
Calc in Loop 8
Calculate total
Concurrency Demo: multiple processors involved
Execution of the code
Loop 1 | Loop 2 | Loop 3 | Loop 4 | Loop 5 | Loop 6 | Loop 7 | Loop 8 (in parallel)
Calculate total
A visual demo – Image Quantization
A practical application – handling client requests
SERVER
But we can serve 6 of them at a time!
A concurrency example
Apache Web Server
Control Process (Parent)
Child Process Child Process Child Process
Listener Threads Server Threads
Web Clients
Problems
Threads are generally lighter than
processes, a large number of
concurrent threads can still
consume significant memory and
CPU resources, potentially
leading to performance
degradation under heavy load or
with a high number of idle
connections.
Thread-per-connection
model can encounter
scalability limitations,
especially when handling
a massive number of
concurrent, long-lived
connections
Creating, managing, and
destroying a large number of
threads can introduce overhead,
especially under fluctuating or
bursty workloads, potentially
impacting overall performance.
C10k problem
Stated by software engineer
Dan Kegel in 1999
Stands for 10,000 concurrent connections
The C10k problem was the
challenge of designing network
servers that could handle
10,000 concurrent client
connections
Nginx story
2002 – Russian software engineer Igor
Sysoev began developing Nginx to solve the
“C10K problem” — handling 10,000
simultaneous connections efficiently.
2004 – Nginx was publicly released as
open-source software under a BSD-like
license.
2006–2010 – It gained popularity for
its event-driven, asynchronous
architecture, outperforming Apache in
serving static content and handling high
concurrency.
2017 – Surpassed Apache in market share
for top 1000 busiest websites, marking a
significant shift in web infrastructure.
Clients
Master
Workers
CPU
Responsible for reading configuration,
creating sockets, managing signals, and
spawning workers.
Does not handle client I/O.
Each worker runs an independent
event loop
Accepts client connections and
handles requests
Each worker is single-threaded, non-blocking, and uses I/O multiplexing
to handle thousands of concurrent connections.
T1 T2 T3 T4
Process
Threads
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
Resources are naturally shared
(in the same codebase) among
the threads and process
No common memory – none of
the resources of one process is
visible outside
How to share the user requests,
sockets and other resources?
Process
MEMORY
Process
MEMORY
Process
MEMORY
Process
MEMORY
OS Kernel
socket
The descriptor of this socket (a file) is shared among processes.
A socket is listening for requests.
Processes use the OS to check whether there is anything in the socket.
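How can separate processes share one socket descriptor? Through fork: the child inherits the parent's descriptor table, so both point at the same open file description. The sketch below shows this with a pipe (the same mechanism that lets every nginx worker accept() on the one listening socket the master opened).

```python
# Sketch of descriptor sharing across fork(): the child inherits the
# pipe's write end, and the parent reads what the child wrote.
import os

def fork_demo() -> bytes:
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                        # child: same descriptors inherited
        os.write(w, b"from child")
        os._exit(0)
    os.waitpid(pid, 0)                  # parent: wait for the child
    os.close(w)
    data = os.read(r, 64)
    os.close(r)
    return data

print(fork_demo())                       # b'from child'
```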
I/O Multiplexing
We showed how a process handles
I/O on a single descriptor.
Often, a process might want to
handle I/O on more than one
descriptor.
I/O multiplexing is a technique
that allows a single thread to
monitor and manage multiple I/O
channels, such as network
sockets, to handle many
connections efficiently
Blocking is not a solution for sure – we cannot keep
waiting!
Non-blocking approach might not be the right approach:
When data is coming in very slowly the program will wake up
frequently and unnecessarily, which wastes CPU resources.
When data does come in, the program may not read it immediately
if it's sleeping, so the latency of the program will be poor.
Handling a large number of file descriptors with this pattern would
become cumbersome.
Solution: I/O multiplexing modes need to be considered
Problem: Requesting
multiple sources in an
asynchronous mode
I/O Multiplexing
There are several ways of multiplexing I/O on
descriptors:
non-blocking I/O - the descriptor itself is marked as
non-blocking, operations may finish partially
signal driven I/O - the process owning the
descriptor is notified when the I/O state of the
descriptor changes
polling I/O - with select or poll system calls, both of
which provide level triggered notifications about the
readiness of descriptors
I/O Multiplexing
non-blocking I/O
All file descriptors are set to non-blocking mode.
If a process tries to perform I/O operations very frequently, it has to
continuously retry operations that returned an error to check whether any
descriptors are ready.
Such busy-waiting in a tight loop could lead to burning CPU cycles.
I/O Multiplexing
signal driven I/O
The kernel is instructed to send the process a signal when I/O can be
performed on any of the descriptors.
The kernel tracks a list of descriptors and sends the process a signal
every time any of the descriptors becomes ready for I/O.
Signals are expensive to catch, rendering signal-driven I/O impractical
for cases where a large amount of I/O is performed.
I/O Multiplexing
polling I/O
The process uses the level triggered mechanism to request
the kernel by a system call which descriptors are capable of
performing I/O.
There are a few I/O multiplexing system calls: select (defined by POSIX),
epoll on Linux, and kqueue on BSD.
These all work fundamentally the same way: they let the kernel know
what events (typically read events and write events) are of interest on
a set of file descriptors, and then they block until something of interest
happens.
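The select pattern in miniature: register interest on two descriptors and block until one of them is readable. A minimal sketch using Python's select module, which wraps the POSIX select(2) call.

```python
# Sketch of polling I/O with select(): monitor two descriptors at once
# and block (up to a timeout) until one of them is ready for reading.
import os
import select

def wait_readable(fds, timeout=1.0):
    ready, _, _ = select.select(fds, [], [], timeout)
    return ready

r1, w1 = os.pipe()
r2, w2 = os.pipe()
os.write(w2, b"ping")                  # only the second pipe has data

print(wait_readable([r1, r2]))         # select reports just r2 as ready
print(os.read(r2, 4))                  # b'ping'
```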
epoll
epoll is a Linux kernel system
call for an I/O event notification
mechanism
epoll monitors multiple file
descriptors to see whether I/O is
possible on any of them.
epoll uses a red–black tree (RB-
tree) data structure to keep
track of all file descriptors that
are currently being monitored
Edge-Triggered Polling (ET)
Events are delivered only when the state transitions (from not-ready →
ready).
Example: you get a read event only when new data arrives, not while data remains
unconsumed.
Requires reading/writing until EAGAIN to avoid missing future events.
More efficient because it avoids repeated notifications.
More complex to implement correctly - mismanagement may lead to
stalled I/O.
Level-Triggered Polling (LT)
The default behavior in many polling APIs.
A file descriptor is reported as “ready” as long as the condition remains
true
(e.g., there is unread data in the buffer).
The application may receive repeated readiness notifications until the
condition is cleared.
Safer and simpler but may cause unnecessary repeated wake-ups.
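Both modes can be tried with Python's `select.epoll` wrapper (Linux only). The sketch below uses the default level-triggered mode: the pipe keeps being reported as ready while unread data remains, and the condition clears once we consume it.

```python
# Sketch of Linux epoll in its default level-triggered mode: the
# descriptor is reported as ready for as long as unread data remains.
import os
import select

r, w = os.pipe()
ep = select.epoll()
ep.register(r, select.EPOLLIN)         # interest: readable events on r

os.write(w, b"data")
events = ep.poll(1.0)                  # [(r, EPOLLIN)] while data is unread
print(events)
print(os.read(r, 64))                  # consume it; the condition clears

# Edge-triggered mode would register select.EPOLLIN | select.EPOLLET and
# then read until EAGAIN, since the event fires only on the transition.
ep.close()
```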
Nginx - Full Scenario
The master reads nginx.conf, creates
listening sockets, binds to ports (e.g., 80,
443), and sets them as non-blocking.
Then it forks the worker processes.
Each worker inherits the listening socket file
descriptors from the master.
During its startup routine each worker:
Initializes its event loop subsystem.
Calls epoll_create().
Stores the returned epoll file
descriptor for future event registration.
Context Switch and Pinning
Context Switching in Multiprocessing
The CPU rapidly switches between processes,
pausing one and resuming another so they can
“share” the same core.
Each switch has a cost: the CPU must save the
current state, load another one, and refill caches,
which can slow systems down under heavy load.
Nginx ties each worker process to a specific CPU
core, avoiding unnecessary movement between
cores.
This improves performance and lowers latency,
because the worker keeps its core’s warm caches
and avoids extra context-switch overhead.
Customer Service Analogy - Apache
Only 4 computers available
Hired 25 people to handle customer requests
Customer Service Analogy - NGINX
Only 4 computers available
Only 4 employees – each with a dedicated PC
Example: one customer may need 10 seconds of work, while another may
require an approval from the head office or the credit system.
Employees pick the next task when they have time (e.g., while a previous
task is waiting on something).
Another asynchronous event loop model – Node.js
Concurrency in Functional
Programming languages
Immutability simplifies concurrency
No shared mutable state: fewer race
conditions than in imperative code.
Pure functions parallelize naturally
No side effects: tasks run safely on different
cores without coordination.
Actor-based concurrency
Many FP languages use message-passing
(Erlang, Elixir, Scala/Akka) instead of shared
memory.
Higher-level concurrency primitives
Futures, promises, parallel map, and lazy
streams reduce the need for locks.
Icon source
Recommendations
1. Learn Linux - it is a monolithic OS; check out the alternatives. What
will the OSs in quantum computers be?
2. Learn concurrency programming.
3. Learn low-level programming - you cannot get ready for quantum if you
don't understand hardware well.
Questions?
References
What is epoll? A Medium post
Non-blocking I/O – A Medium post
Blocking I/O, Nonblocking I/O, And Epoll (link)
1
Hey all,
I’m Amulya Bhatia!
Chief Architect Gen AI SME GDE
They/Them
My ADK Book LinkedIn
Firebase Studio is an
integrated and extensible
agentic workspace to
build, run, and manage
web apps, cross-platform
mobile apps, backend
services, and more.
Web-based
Full cloud VM
Deep Google
Integrations
Preconfigured
Environments Live preview
Designed for
collaboration
Workspace sharing available
with more features planned.
AI Assistance
Built on VS Code
Your workspace is a URL,
installable as a PWA.
Supports most toolchains.
Start from common starting
points or customize your own.
Streamline app dev workflows.
Across code, test,
debugging, etc.
Real device, emulator,
or hosted IFRAME.
World-class code editing.
www.firebasestudio.com
Multimodal /
Natural Language
Create a
blank project
Templates
Open an
Existing Project
Agentic
Experiences
Throughout
Our advanced coding capabilities are enhanced with agentic experiences
that take complex (or boring) actions on your behalf. Whether the changes
need to happen across a section of code, a single file, or an entire
codebase, Gemini will understand your intent and accomplish the task.
AI-centric View Code-centric View
We want to ensure you all have a choice in using as much or as little AI
as you want when building your apps.
Share and
Collaborate
in Real-Time
Not only can you share the deployed link, you can share the entire
workspace with a URL.
This means you can collaborate in
real-time within the same Firebase
Studio environment, and then push
updates instantly.
Firebase Studio in
Action
App Prototyping Agent
To build our full stack web application, we can start with a natural
language prompt and it will create a PRD for us to review.
After we modify the proposal
as needed we can generate the
app and iterate in chat to
update and rebuild the
application.
We can prototype quickly to get a Next.js app with Genkit for agentic
features that is connected to Firebase!
Quickly Deploy to Firebase
We can quickly deploy our
application we created to
Firebase App Hosting.
The wizard will guide you in picking the correct project and billing
account and kicking off the new or updated deployment.
Firebase Studio supports existing codebases and any stack that can be
installed with Nix (120k+ packages).
Exploring the Code
Every workspace is backed by a full IDE in which we can edit the
generated code.
There is a full VM backing
Firebase Studio so you can run
commands that would usually
fail in the browser.
Editing the code with Gemini
In the IDE Code view of
Firebase Studio we can still
have the full power of Gemini
in the workspace.
Gemini can read and write
files and run terminal
commands with the full context
of the project, recently
opened files and any
attachments sent.
a lightweight, powerful, and accessible CLI
tool that integrates cutting-edge AI
directly into the terminal.
Proprietary + Confidential
Gemini CLI
Code and Files Invoke tools Coordinate with
other apps
Coordinate with other
applications like VS Code
or Chrome, performing
actions or gathering
context
Comprehensive
Context
Including project files,
data, and potentially even
screen sharing – to
provide the most relevant
and effective assistance
Generate code and
manage files (the core CLI
function)
Intelligently invoke other
developer tools, MCP
support, to manage local
development, run tests,
interact with cloud
services, etc
Gemini CLI Architecture
User via Terminal packages/cli packages/core
Tools / MCP Servers
Gemini/Code Assist API
User sends
tasks/prompts via
terminal and receives
outputs and requests
(to perform actions via
tools)
Sends user
requests/confirmations
to the core and receives
tools details and final
outputs from the core
Sends prompts/tools
info to Gemini
Executes Tools and
receives results
Local Tools (file, shell, web)
Local and Remote MCP
Servers
Interactive and File/Shell
Integration
/ commands
@ context
! shell
Powerful Built-In Tools
File system tools (read_file, write_file,
list_directory, search_file_content),
Shell tool (run_shell_command)
Web tools (web_fetch,
google_web_search)
Hierarchical Context & Memory
Project-specific and global instructions to the AI using GEMINI.md files
Secure Sandboxing
Run in isolated environment using
technologies like Docker, Podman,
or macOS's native Seatbelt
functionality
Custom Commands and
MCP
Create custom slash commands for frequently used prompts.
Model Context Protocol (MCP) servers
32
Gen AI Toolbox benefits
Better manageability, observability, and security for your gen AI agents
Reduced boilerplate code to simplify tool development
Efficient connection pooling and optimized connectors for databases
Configuration-driven approach enables deployment without interruption
Zero downtime
Enhanced security
Better performance
Simplified development
End-to-end observability: integrated with Google Cloud monitoring and tracing.
Provides simple patterns for integrating user authentication.
Toolbox flow
1. Access Database - specify the URI
2. Define tools (tools.yaml)
3. Load tools and invoke a tool
● Configuration (tools.yaml)
○ Users define several resources in a file
○ 3 sections: Sources, Tools, Toolsets
○ Each is defined in a map of name → object definition
○ Toolbox loads them on start up and builds the appropriate APIs
Source Tools Toolset
Cloud
SQL
AlloyDB
Postgres
interacts with
one of
postgres-sql
( user
defined )
Define sources
sources:
  # This source kind has some requirements. See
  # https://github.com/googleapis/genai-toolbox/blob/main/docs/sources/cloud-sql-pg.md#requirements
  my-cloud-sql-source:
    kind: cloud-sql-postgres
    project: my-project-name
    region: us-central1
    instance: my-instance-name
    user: my-user
    password: my-password
    database: my_db
Sources represent a source of data that a tool can use. Typically they encapsulate _how_ a
database is connected to – IP address, credentials, etc.
Define Tools
tools:
  get_flight_by_id:
    kind: postgres-sql
    source: my-cloud-sql-source
    description: >
      Use this tool to look up a single flight by its unique id. The agent
      can decide to return the result directly to the user.
    statement: "SELECT * FROM flights WHERE id = $1"
    parameters:
      - name: id
        type: int
        description: "'id' represents the unique ID for each flight."
Tools are an action, typically executed on a source. These include description of how/when to
take the action, and things like which parameters are specified.
Define Toolsets
toolsets:
  my_first_toolset:
    - my_first_tool
    - my_second_tool
  my_second_toolset:
    - my_second_tool
    - my_third_tool
Tool sets are logical groups of tools. You can load all tools or tools by toolset to pass into an
agent.
# This will load all tools
all_tools = await client.load_toolset()
# This will only load the tools listed in 'my_second_toolset'
my_second_toolset = await client.load_toolset("my_second_toolset")
Demo
38
AI TRACK - Mirakram Aghalarov & Zahra Bayramli, Nishi Ajmera, Kamran
Huseynov & Tarlan Huseynov, Nikhilesh Tayal, Yoyu Li
WHEN LLM STAYS HOME:
BUILDING KNOWLEDGE
SYSTEMS ON PREMISES
Mirakram Aghalarov
Zahra Bayramli
</SPEAKERS />
Mirakram Aghalarov Zahra Bayramli
Senior Deep Learning Engineer @SOCAR
CIC
Lecturer of AI&ML courses at BHOS
MSc Data Science and Engineering Graduate
from Politecnico Di Torino
Deep Learning Engineer @SOCAR CIC
BSc Computer Science Graduate from
Korean Advanced Institute of Science &
Technology (KAIST)
//Table of contents
{02}
{01}
{03}
Cloud Infrastructure
On-Premise Infrastructure
Hybrid Solution
</Cloud Infrastructure
01
</AI Start Ups Worldwide
2024 2025
~92000 ~212000
Delivery Hero SE is a German multinational online food ordering
and food delivery company based in Berlin, Germany.
REI is a member-owned retailer that sells outdoor gear,
promotes sustainable adventure and customer specific design
using GenAI.
</Distribution over Cloud
The AI developer platform to build AI agents, applications,
and models with confidence
Accelerating AI development and deployment with a secure
collaboration platform for AI developers and data providers
</Distribution over Cloud
Deepset is an enterprise software vendor that provides
developers with the tools to build production-ready Artificial
Intelligence and natural language processing systems.
Robin AI is a legal-tech company that provides an AI-powered
platform to review, analyze, and manage contracts far more
quickly and securely.
</Distribution over Cloud
</How to build simple Chatbot
Vertex AI Studio
Vertex AI Search
Agent Garden
Ray
</How to build simple Chatbot
</How to build simple Chatbot
GCP provides an easy-to-use pipeline builder to establish the flow.
To build a simple RAG agent, Vertex AI Studio helps integrate the chatbot
with Vertex AI Search capabilities based on the selected vector database.
</How to build simple Chatbot
AI Document Intelligence
AI Content Understanding
OpenAI Model Registry
Azure Machine Learning
</How to build simple Chatbot
</How to build simple Chatbot
Amazon Sagemaker
Amazon Bedrock
Amazon Lambda
AWS S3
Amazon Kendra
Amazon API Gateway
</How to build simple Chatbot
Amazon Sagemaker Amazon Bedrock
Amazon Lambda AWS S3
Amazon Kendra
Amazon API Gateway
</Agentic Cloud Tools
01
</Agentic AI
LLM
</Agentic AI
LLM
</Agentic AI
My dear, Tell me about weather in our 4th plant
What a marvellous question! I am going to check
what is the location of “our 4th plant” …
Okay! Now location of plant with ID 4 is known.
SQL call to the database
</Agentic AI
My dear, Tell me about weather in our 4th plant
What a marvellous question! I am going to check
what is the location of “our 4th plant” …
Okay! Now location of plant with ID 4 is known.
SQL call to the database
Now lets look at the search on services about the
weather information
The answer is 26 degrees Celsius
Searching through the Web
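The two-step dialog above can be sketched as a toy agent loop. The tool names (`plant_location`, `weather`) and their canned data are hypothetical stand-ins for the SQL call and web search in the slides; a real agent would let the LLM choose the tools and arguments at each step.

```python
# Toy sketch of the agentic flow above: resolve "our 4th plant" via a
# (hypothetical) SQL-backed tool, then query a (hypothetical) weather tool.
def plant_location(plant_id: str) -> str:
    """Stand-in for the SQL call to the plant database."""
    return {"4": "Baku"}.get(plant_id, "unknown")

def weather(city: str) -> str:
    """Stand-in for the web search on weather services."""
    return {"Baku": "26 degrees Celsius"}.get(city, "unknown")

def answer_weather_for_plant(plant_id: str) -> str:
    city = plant_location(plant_id)     # step 1: find the plant's location
    return weather(city)                # step 2: look up the weather there

print(answer_weather_for_plant("4"))    # 26 degrees Celsius
```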
</Model Context Protocol
</ReAct Agent
Query
Answer
Tools
Thought
</Cloud Evaluation
Very Low Downtime
High Scalability
Mature Solutions
Low Code Environment
Faster Deployment
No CAPEX
Higher OPEX
Network Bandwidth Limitation
Uncontrollable Cost
Not Suitable for Real time
Limited Privacy
</AWS Spread
Limited Privacy
</Azerbaijan Situation
Limited Privacy
Azerbaijani legislation does not allow private information to be processed
in cloud infrastructure because of where that infrastructure is located.
Cyber threats and privacy insurance do not let any sensitive information
be sent outside the borders of Azerbaijan.
</Azerbaijan Situation
Limited Privacy
Higher time for MVP
Higher requirements to
accomplish the task
Less number of startups
</On-prem
Infrastructure
02
Data Privacy and Compliance
- Sensitive documents, internal systems
</Need for On-prem
Restricted Environments
- Factories, labs, government facilities
with no external internet access
Cost
- Cloud APIs become expensive with higher
usage
Full Customization and Control
- Ability to retrain, fine-tune, or modify
models without vendor limits
</RAG Architecture Example
Document Ingestion Embedding Generation Vector Database
Vector Representation
Text/Table
Extraction
Chunking
Embedding Model
Stores embeddings and allows fast similarity search
0.2 0.7 0.5 …
Optical Character
Recognition
</RAG Architecture Example
Retriever LLM (on-prem) Application Layer
Vector Search
Hybrid Search
Reranker
Prompt + Retrieved Chunks UI/Chat Interface
FastAPI/Tools
Output
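The retrieval step in the pipeline above boils down to comparing a query embedding against stored chunk embeddings. A minimal sketch: the toy 3-d vectors and chunk names are illustrative; a real system would use an embedding model and a vector database with approximate search.

```python
# Hedged sketch of vector retrieval: rank stored chunks by cosine
# similarity to the query embedding. Toy vectors stand in for real ones.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

STORE = {                               # chunk -> embedding (illustrative)
    "refinery manual": [0.9, 0.1, 0.0],
    "hr policy":       [0.0, 0.8, 0.6],
}

def retrieve(query_vec, k=1):
    ranked = sorted(STORE, key=lambda c: cosine(STORE[c], query_vec),
                    reverse=True)
    return ranked[:k]

print(retrieve([1.0, 0.0, 0.0]))        # ['refinery manual']
```

A reranker, as in the slides, would then re-score just these top-k chunks with a heavier model before they are passed to the LLM prompt.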
</On-premise Infrastructure
OCR
DB
LLM
Tools
</On-premise Infrastructure
OCR
DB
LLM
Tools
GPU
Data
Center
</On-premise Infrastructure
OCR
DB
LLM
Tools
GPU
Data
Center
Some Cloud
in
Azerbaijan
</Challenges with Open-Source OCR
Complex Tables
Merged cells, nested tables
VLMs Offline
Large, slow, proprietary
Handwritten documents
Cloud APIs > open source
Azure AI Document
Intelligence
Google Vision API
- High VRAM Requirements
- No fine-tuned OCR pipelines
- Latency too high without GPU clusters
</Vector Database
Core Components of
On-Prem Retrieval
Local Vector Database Embedding Model
Reranker
bge-m3 E5-large
Nomic-embed
Jina v2
MiniLM
Instructor-xl
bge-reranker-large
colbert
</Vector Database
Capabilities Missing
Hybrid Search
- Cloud versions tune this automatically; on-prem
requires manual scoring & fusion
Premium Features
- E.g. Reciprocal Rank Fusion (RRF) in ES not
available on-premise; requires custom implementation
Enterprise-Grade Monitoring & Analysis
- Cloud dashboards show slow queries, index health,
drift detection etc.
</Challenges with Open-Source Models
Sensitivity to Noisy Input
- DeepSeek/Qwen degrade more on messy documents
Noisy input: "Mtng w/ team @ 4pm / pls updte repprt / Q3 target = 1O0K ??? / ask Sam re: buget"
Llama: "Meeting with the team at 4pm. Update the report. Q3 target is 100K. Ask Sam about the budget."
Qwen: "The meeting is about 4pm and a report, possibly about budget. Not sure what target means."
</Challenges with Open-Source Models
Sensitivity to Noisy Input
- DeepSeek/Qwen degrade more on messy documents
Context Size Expansion
- Big documents ⇨ huge prompts ⇨ slower inference
Quality & Alignment
- More hallucination, low quality answers
</Agentic AI On-Prem
Model Context Protocol (MCP)
- On-prem LLMs often not aligned for tool calls
MCP HOST
MCP Server A
MCP Server B
MCP Server C
MCP Protocol
MCP Protocol
MCP Protocol
Local Files Storage
External APIs & Apps
Remote Database
</Agentic AI On-Prem
Unreliable Function Calling
- No built-in “tool calling” alignment like ChatGPT or Gemini
ReACT Agents do not work
- Lose state across steps ⇨ repeat wrong calls. Produce inconsistent
“thought/action” formatting. Hallucinate tool names or steps
</Agentic AI On-Prem
Knowledge Graph Integration
Entity Extraction
Database
Graph DB
Entities and
relationships
Agent
Graph Query Generation Output
LLM Result
User Query
Graph DB
</Agentic AI On-Prem
Knowledge Graph Integration
- LLM extracts structured
entities and relations using
long rule-based prompts
- Recursive multi-pass
extraction across chunks
</GPU infra: Inference & Deployment
TensorRT-LLM
- Provides 2-4x faster throughput compared to raw PyTorch
- Provides better parallelism
- Supports FP8/INT4 quantization to reduce memory usage
- Requires custom engine building based on GPU model
</GPU infra: Inference & Deployment
vLLM / TGI
- High-throughput distributed serving
- Good batching & streaming performance
- Supports HF models & the OpenAI API standard
- Requires full GPU stack
Ollama / llama.cpp
- Easy setup, simple single-node runtime
- Supports GGUF quantized models, small memory footprint
- Limited scaling & lower token throughput
</On Premise Evaluation
Pros: High Privacy, Faster Communication, Real-time Streaming,
Full Control over Cost, More Customization
Cons: Limited Scalability, High CAPEX, High Downtime,
Higher Development Time
</Best of 2 Worlds
High Privacy
Faster Communication
Real-time Streaming
Full-control over Cost
More Customization
Very Low Downtime
High Scalability
Mature Solutions
Low Code Environment
Faster Deployment
Hybrid Is a Possible Choice
Video Stream
Text Stream
Data Stream
Clean my code! Now!
Blurring
Masking
Anonymization
Hybrid Is a Possible Choice
Blurring
Masking
Anonymization
LLM
Cloud
Tool 1 Tool 2 Tool 3
Demasking
Hybrid Is a Possible Choice
Tool 3
Demasking LLM Answer
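The blurring/masking/demasking flow above can be sketched in a few lines: sensitive values are replaced with placeholders before the text leaves the premises, and the cloud LLM's answer is demasked locally. The regexes and placeholder format are illustrative assumptions; a production system would cover many more PII types.

```python
import re

def mask(text):
    """Replace emails and phone numbers with placeholders before sending to the cloud."""
    mapping = {}
    def repl(kind):
        def _repl(m):
            key = f"<{kind}_{len(mapping)}>"
            mapping[key] = m.group(0)   # remember original value for demasking
            return key
        return _repl
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl("EMAIL"), text)
    text = re.sub(r"\+?\d[\d -]{7,}\d", repl("PHONE"), text)
    return text, mapping

def demask(text, mapping):
    """Restore the original values in the cloud LLM's answer."""
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text

masked, mapping = mask("Contact john@example.com or +994 50 123 45 67")
answer = demask(f"Reply sent to {list(mapping)[0]}", mapping)
```

The cloud side only ever sees placeholders like `<EMAIL_0>`; the mapping never leaves the on-prem environment.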
//Conclusion
It is not easy to keep pace with worldwide development from Azerbaijan.
The only way out is to understand the reasons behind the limitations and
find ways to bypass them, eliminating the disadvantages of both sides.
CREDITS: This presentation
template was created by
Slidesgo, and includes icons
by Flaticon, and infographics
& images by Freepik
//Thanks,
Do you have any questions?
Using MCP, A2A and ADK
Building
Conversational
Agents
Nishi Ajmera
Solutions Architect
Publicis Sapient
A software entity designed to act
autonomously to achieve specific goals
Performs tasks, interacts with users, utilizes
external tools
Goes beyond simple input/output – they can
reason, plan, and orchestrate
What is an AI Agent ?
AI systems that use natural language to interact with users and
complete tasks
What are Conversational Agents ?
User
Coding Assistant
Assistant (Siri, Alexa)
Support Bot
Modularity: Break down complex problems into
smaller, manageable agent tasks
Specialization: Create expert agents for specific
functions (e.g., a "billing agent," a "research agent")
Collaboration: Agents can work together, delegating
tasks and sharing information
Scalability & Maintainability: Easier to update, debug,
and scale individual components.
From simple tasks to complex workflows
Agentic Architectures
Agent Development Kit
(ADK)
A flexible and modular framework for developing and
deploying AI agents.
Key Goals
Make agent development feel like software
development.
Simplify creation, deployment, and
orchestration.
Core Principles
Model-agnostic (Optimized for Gemini, but
supports others via LiteLLM).
Deployment-agnostic (Local, Cloud Run, Agent
Engine).
Compatibility with other frameworks (e.g.,
LangChain, CrewAI).
Agent Development Kit
Model Context Protocol
MCP standardises the way AI models and tools communicate and
share context
Why MCP ?
Why MCP ?
Agent to Agent Protocol
Enables seamless communication and coordination between
multiple AI agents
Main Actors in A2A
How does A2A work ?
Agent to Agent & MCP
MCP vs A2A
Aspect | MCP (Model Context Protocol) | A2A (Agent2Agent Protocol)
Purpose | Agent ↔ Tools/Resources | Agent ↔ Agent Collaboration
Communication | Client-Server (function-like) | Peer-to-Peer (conversational)
State | Stateless (tools as functions) | Stateful (task lifecycle)
Best For | Accessing tools, APIs, databases | Multi-agent coordination
Agent as a Tool ?
When to Use Agent-as-Tool (MCP):
Single orchestrator architecture
Short-to-medium tasks (minutes to ~1 hour)
Need tight control over workflow
Simple request-response patterns
You need deterministic, structured
interactions
When to Use A2A:
Long-running tasks (hours to days)
Peer collaboration and negotiation
Dynamic agent discovery
Multi-vendor ecosystems
MCP ensures agents have the right
context and tools to operate
efficiently, while A2A enables
seamless collaboration. Together,
they create a powerful,
interoperable AI ecosystem
Questions ?
Thank You
Agentic Workflows with AWS
DevFest 2025
Tarlan Huseynov
DevOps Engineer
AWS Community
Kamran Huseynov
AI Engineer
PHASE 1: Chunking, Embedding & Storing
PHASE 2: Semantic Search / Embedding
Inference + Similarity Search
PHASE 3: Augmented Generation & Response
Semantic Search and Retrieval in RAG Pipeline: embed → search → retrieve →
generate
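The three phases above can be sketched end to end. This is a toy illustration only: the bag-of-characters "embedding" and in-memory store stand in for a real embedding model and a vector store such as a Bedrock Knowledge Base, and generation is stubbed out as prompt construction.

```python
import math

def embed(text):
    """Toy embedding: bag-of-characters vector (a real pipeline calls an embedding model)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# PHASE 1: chunk, embed, store
chunks = ["Nginx uses an event loop", "Apache spawns worker threads"]
store = [(c, embed(c)) for c in chunks]

# PHASE 2: embed the query and run similarity search
query = "event loop servers"
top_chunk = max(store, key=lambda item: cosine(embed(query), item[1]))[0]

# PHASE 3: augment the prompt with the retrieved chunk (LLM call omitted)
prompt = f"Answer using this context: {top_chunk}\nQuestion: {query}"
```

Swapping in a real embedding model and vector index changes the components but not the embed → search → retrieve → generate shape.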
Amazon Bedrock Knowledge
Base
Strands Agents Solution
Strands Agents – AWS-backed:
production-ready agent workflows,
model-driven, composable, and
emphasizing simplicity
MCP – Modular tool servers,
scalable & observable
Self-Managed provisioning for
logical stack (ECS + Fargate)
Building With Strands Agents
GitHub
LinkedIn Meetup WhatsApp
Production-ready AI Agents
AI Agents
LLMs
+ reasoning,
+ external applications,
+ self-reflection capabilities
Do you
need an
AI Agent ?
Give me 10 ideas for my Twitter post on AI
Prepare a report of top AI research papers
Send Sachin a leave request and update my
calendar accordingly.
Book the cheapest flight from Delhi to Dubai
Translate this paragraph from Hindi to English
Write an email requesting leave in a polite tone
AI Search
Agent
How an Agent can access a tool
Without AI Agent
AI Agent with Google Search
How the model was thinking
Memory
Memory is
important for
conversations
AI Agents actually don’t have any memory
Memory has to be handled outside LLM
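Since memory lives outside the LLM, the application keeps the history and prepends it to every call. A minimal sketch, assuming a simple sliding-window policy (class and method names are illustrative, not a specific framework API):

```python
class ConversationMemory:
    """Minimal sketch: the application, not the LLM, keeps the message history."""

    def __init__(self, max_messages=20):
        self.max_messages = max_messages
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Self-managing trim: drop the oldest turns once the window is full.
        self.messages = self.messages[-self.max_messages:]

    def as_prompt(self):
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)

memory = ConversationMemory(max_messages=4)
memory.add("user", "My name is Nishi")
memory.add("assistant", "Nice to meet you, Nishi!")
memory.add("user", "What is my name?")
# Every LLM call is prefixed with the stored history:
prompt = memory.as_prompt() + "\nassistant:"
```

More advanced variants (summarising old turns, letting the agent edit its own memory via a tool) follow the same pattern: state is managed by the application and injected into the prompt.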
AI Agent
Memory
Memory is important for
conversations
AI Agent without memory
AI Agent without memory
Adding a conversation memory
Adding a conversation memory
AI Agent with a conversation memory
Adding a conversation memory
Adding a conversation memory
Self
managing
memory
AI Agent with self-managing memory
AI Agent conversation
AI Agent conversation
Making memory editable
AI Agent with self managing memory
Tool for memory management
Defining Agent’s memory
Managing memory on its own
There are different kinds of “Memory”
AI Agent
Evaluation
How to evaluate the
chaos?
Can you identify what went wrong here?
What could go wrong with AI Agents
What could go wrong with AI Agents
Evaluation – Traditional Software vs AI Agents
Evaluation – Traditional Software vs AI Agents
Suite of LLM Evaluation Methods
Suite of LLM Evaluation Methods
AI Agent evaluation
Evaluation
Techniques
- Code-based evals
- LLM as a judge
- Human annotations
Code based evals
Regex
JSON Parsable
Code based evals
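The two code-based checks named above (regex match, JSON parsability) can be implemented directly; a minimal sketch, with the date pattern as an illustrative expected format:

```python
import json
import re

def eval_json_parsable(output):
    """Code-based eval: does the agent output parse as JSON?"""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

def eval_regex(output, pattern=r"\d{4}-\d{2}-\d{2}"):
    """Code-based eval: does the output match an expected format (here: a date)?"""
    return re.fullmatch(pattern, output) is not None

results = {
    "json_ok": eval_json_parsable('{"action": "search", "query": "flights"}'),
    "json_bad": eval_json_parsable("search flights"),
    "date_ok": eval_regex("2025-11-23"),
}
```

These evals are cheap and deterministic, which is why they are usually the first layer before LLM-as-a-judge or human annotation.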
LLM as a judge
LLM as a judge – Important considerations
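The LLM-as-a-judge pattern boils down to a rubric prompt plus score parsing. A hedged sketch follows: the rubric wording and `SCORE:` convention are assumptions, and the judge is stubbed with a lambda so the parsing logic can be shown without a real model call.

```python
import re

# Hypothetical rubric prompt; real judges usually add criteria and few-shot examples.
JUDGE_PROMPT = """You are an impartial judge. Rate the answer from 1 to 5
for correctness and helpfulness. Reply as: SCORE: <n>
Question: {question}
Answer: {answer}"""

def llm_as_judge(question, answer, judge_llm):
    """Ask a (usually stronger) model to grade the answer; parse the numeric score."""
    verdict = judge_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"SCORE:\s*([1-5])", verdict)
    return int(match.group(1)) if match else None

# Stub judge for illustration; in practice judge_llm would call a real model.
score = llm_as_judge("Capital of Azerbaijan?", "Baku", lambda prompt: "SCORE: 5")
```

Returning `None` on an unparseable verdict matters: judge outputs themselves can be malformed, which is one of the "important considerations" of this technique.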
Annotations
Elements to
evaluate
- Tool choice
- Generation
- Path choice
If your Agent’s output is correct,
does the trajectory matter?
ADK
Open foundation for making
production-ready agents
Google ADK – available as a Python SDK
Open Foundation
Agent Development Kit
Google ADK – Agent Development Kit
And one command to debug
Happy to connect on LinkedIn
- Founder at “AI ML etc.”
- AI Instructor at LinkedIn Learning
- Google Developer Expert for AI
- IIT Kharagpur alumnus
Thanks a lot!
Creative Technology Director, Infinite Whys
Baku
Beyond the Prompt
An Anatomy of an AI-Powered Game Using Agent Development Kit
Yoyu Li
2025
What we are going to talk about
● what ADK is, and why (spoiler: Agent Development Kit)
● a game demo
● a simple agent
● the multi-agent pattern
● a custom agent
01
what is ADK*?
and why?
*Agent Development Kit
An Evolution:
LLM + Prompt → LLM + Retrieval (RAG) → LLM + Retrieval + Tools → Agent (+ many tools, + reasoning loop) → Multi-Agent Systems
AI Agents reason, plan, and execute tasks for users.
Generative AI key components (the end user sends a Query and receives a Response):
- Model(s): used to reason over goals, determine the plan and generate a response (an Agent can use multiple models)
- Orchestration: model-based reasoning/planning and task execution loop; executes the steps of the LLM-derived plan to accomplish given tasks, including tool invocations and maintenance of intermediate state
- Agent Definitions: profile, goals, instructions, tools, …
- Tools: fetch data, perform actions or transactions by calling other APIs or services (Functions, APIs, Databases)
- Memory: short-term and long-term
Is there something more personal?
ADK helps us build a good
architecture for AI-powered
applications. Having specialised
agents means we can
potentially use smaller/more
efficient AI, or even no AI. And
we should only use Gen AI
when it makes sense to do so.
02
Demo time
A simple game built in Python
(this is for demo only, therefore not a very sophisticated application)
03
An anatomy
What is under the hood
question agents folder
validation agent folder
file structure
main.py
ui_components.py
game_controller.py
.env file
adk_runners.py
test_runners.py
agent.py
agent.py
display layer
control layer
ADK runners
& sessions
Simple Agent
Multi Agents
Git repo: https://github.com/yoyu777/adk-game-demo-public
04
a simple LLM agent
to validate the user
input
AI validation agent
create
the Agent
creating the agent using
ADK command line interface
> adk create [agent_name]
from pydantic import BaseModel, Field
from google.adk.agents import LlmAgent

class ValidationOutput(BaseModel):
    is_valid: bool = Field(..., description="Indicates if the input is a valid object for the game.")
    reason: str = Field(..., description="Explanation of why the input is valid or not.")

root_agent = LlmAgent(
    name="validation_agent",
    model="gemini-2.5-flash",  # Or your preferred Gemini model
    instruction="You are in charge of validating the user's initial input",
    description="""The user is going to play a game of 20 Questions. Before starting the game,
    you need to validate the user's input to ensure it is a valid object for the game. A valid object should be
    a noun like the name of an object, an animal, or a concept, and not too obscure. For example, "cat", "car",
    "apple" are valid, but "quantum entanglement" or "the number seven" are not.""",
    output_schema=ValidationOutput
)
https://github.com/yoyu777/adk-game-demo-public/blob/main/validation_agent/agent.py
debug
the Agent
debugging the agent
in browser
debugging the agent using
ADK command line interface
> adk web
> adk run [agent_name]
question agents folder
validation agent folder
Call the agent in the game
main.py
ui_components.py
game_controller.py
.env file
adk_runners.py
test_runners.py
agent.py
agent.py
display layer
control layer
ADK runners
& sessions
Simple Agent
Multi Agents
Runners, Sessions & Agents
Read more about ADK Runtime: https://google.github.io/adk-docs/runtime/
async def initialise_validation_agent(self):
    session_service = InMemorySessionService()
    app_name = "ValidationAgent"
    self.validation_agent_session = await session_service.create_session(
        app_name=app_name,
        user_id="test_user"
    )
    self.validation_agent_runner = Runner(
        agent=validation_agent,
        session_service=session_service,
        app_name=app_name
    )
    logger.info("Validation agent initialized")
    return
https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
async def validate_input(self, user_input="elephant"):
    try:
        async for event in self.validation_agent_runner.run_async(
            user_id="test_user",
            session_id=self.validation_agent_session.id,
            new_message=types.Content(role='user', parts=[types.Part(text=user_input)])
        ):
            if event.is_final_response():
                logger.debug(event.content.parts[0].text)
                return json.loads(event.content.parts[0].text)
    except Exception as e:
        logger.error(f"Error during validation: {e}")
        return None
https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
05
the multi-agent
pattern
the guessing agents
- LLM agent: guess agent (makes a guess each new round)
- LLM agent: question agent (asks the next question if the guess is not confident)
- custom agent: orchestrator (coordinates the two and produces the answer)
class GuessOutput(BaseModel):
    guess: str = Field(..., description="The final guess for what the user is thinking of")
    confidence: int = Field(..., description="Confidence level (1-10) in this guess")
    reasoning: str = Field(..., description="Explanation of why this is the best guess. Summarise in less than 20 words.")

# Guessing Agent - Responsible for making final guesses
guessing_agent = Agent(
    name="guessing_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at making educated guesses in 20 Questions game",
    description="""You analyze all the information gathered from previous questions and answers
    to make the best possible guess about what the user is thinking of.""",
    output_schema=GuessOutput,
    output_key="guess_output"  # this is how you pass data between agents
)
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
class QuestionOutput(BaseModel):
    question: str = Field(..., description="A strategic yes/no question to ask")
    reasoning: str = Field(..., description="Explanation of why you ask this question. Summarise in less than 20 words.")

# Asking Agent - Responsible for generating strategic questions
asking_agent = Agent(
    name="asking_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at asking strategic yes/no questions in 20 Questions game",
    description="""You specialize in asking the most effective yes/no questions to narrow down possibilities.
    Your goal is to eliminate as many possibilities as possible with each question.
    Consider categories like:
    …""",
    output_schema=QuestionOutput,
    output_key="question_output"
)
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
06
Custom agent
for orchestration
# extending the BaseAgent class
class RootAgent(BaseAgent):
    guessing_agent: Agent
    asking_agent: Agent

    def __init__(self, name: str, guessing_agent: Agent, asking_agent: Agent):
        super().__init__(
            name=name,
            guessing_agent=guessing_agent,
            asking_agent=asking_agent,
            sub_agents=[guessing_agent, asking_agent]
        )

    # overriding the run implementation: custom logic generating a series of events
    async def _run_async_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        # (custom logic)
        yield event
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
# calling the guessing agent
async for event in self.guessing_agent.run_async(ctx):
    yield event

# accessing session state
guess_output = ctx.session.state.get("guess_output", None)
confidence = guess_output.get("confidence") if guess_output else None

# deterministic logic
if confidence and confidence >= 9:
    logger.info("High confidence guess, proceeding to make guess")
    yield self.create_text_response_event(dumps({
        "action": "make_guess",
        "guess": guess_output.get("guess"),
        "reasoning": guess_output.get("reasoning")
    }), invocation_id=invocation_id)
    return

# calling the asking agent
async for event in self.asking_agent.run_async(ctx):
    yield event
https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
07
Summary
● ADK helps you build applications
● important concepts: Runners, Sessions, Agents
● simple agent, multi-agent & custom agent
we didn’t cover tools, workflow agents, using
custom models, which are also interesting.
Here is an excellent learning resource:
https://codelabs.developers.google.com/onramp/instructions
https://github.com/yoyu777/adk-game-demo-public
Thank you
Workshop/Data/ML/Cloud/Mobile -
Alper Sari, Javid Aliyev
Deploying
resources via
Gemini CLI
gcloud
IAC REST
Resource Creation
IAP
cloud-shell
gemini-cli (available in Cloud Shell)
gcloud mcp
cloud-run mcp
Prerequisites
Part - 1
Claim the GCP Credit
Link with billing account and project
Install Gemini CLI’s gcloud MCP
Visit Prompt Generator
Part - 2
Create a VPC Network
Create a Subnet
Create a VM Instance and Install
NGINX
Check the server
Overview
Credit Link : trygcp.dev/claim/devfest-baku
Repo : github.com/alper-sari/geminicli-cloud-shell-tutorial
Links
Thanks a
lot!
Feedback Form
WebRTC
Javid Aliyev - @thinkingIT
Software Engineer
Catch the Bug Before It Blinks
Pro Testing for Front-End Devs
About Me
Global companies I have worked at:
Processica (Branch of AWS)
Cymulate (Israel)
1. Why we need testing
Benefits of front-end testing
@GDG
Identifies bugs
Ensures Consistency
Cross-browser/device compatibility
Faster Development cycle
Scope for third-party integration
Example
EXAMPLE
@GDG
@GDG
Differences between backend and frontend testing
Backend Testing
Focuses on functionality of
the server and database
Ensures performant APIs
Does not require a browser
Frontend Testing
Focuses on interaction
between user and software
Does not require a database
May require a browser
@GDG
@GDG
Unit testing:
Jest
Vitest
Mocha
Tools for each test phase
Performance Testing
Lighthouse (Chrome DevTools)
WebPageTest
K6
End-to-End (E2E) Testing
Playwright
Cypress
Selenium
WebdriverIO
Integration Testing
Vitest (component + store)
Jest
React Testing Library
Cross-Browser Testing
Playwright (Chromium, Firefox,
WebKit)
BrowserStack
Sauce Labs
Accessibility Testing
axe-core (industry standard)
jest-axe
Lighthouse Accessibility Audit
Pa11y
@GDG
Visual Regression Testing:
Playwright Snapshots
Chromatic (for Storybook)
Percy
Applitools Eyes
Loki
Tools for each test phase
Acceptance Testing
Cypress (business-flow
validation)
Playwright
Testim / QA Wolf (automated
acceptance frameworks)
@GDG
2. Testing tips
LinkedIn @javid aliyev
Telegram @alyevv
Github @cavid-aliyev
Questions?
DevFest Baku 2025 Speaker Presentations (Software, AI, Workshop tracks)
  • 26.
    A process is a file too
    A process is an instance of a running program. Every time a program is launched or a command is executed, a new process with a unique ID (PID) is created. procfs stands for Process File System, typically mounted at /proc. It is a pseudo-filesystem (not stored on disk) that provides a view into the kernel’s internal data structures: its files and directories exist only in memory and are generated dynamically by the kernel.
  • 27.
    A process is a file too
    Each process has its own directory /proc/<PID>/ containing details such as:
    /proc/<PID>/cmdline: command-line arguments
    /proc/<PID>/cwd: symbolic link to the current working directory
    /proc/<PID>/exe: symbolic link to the executable
    /proc/<PID>/fd/: open file descriptors
    /proc/<PID>/status: general process info (UIDs, memory, state, etc.)
    /proc/<PID>/stat and /proc/<PID>/statm: numeric statistics (CPU time, memory usage, etc.)
    /proc/<PID>/environ: environment variables
  • 28.
    Let’s see the files of this process!
  • 29.
    Abstraction for communication
    A pseudo-terminal, often abbreviated as PTY, is a virtual device in Unix-like operating systems, including Linux. It functions as a pair of virtual character devices that establish a bidirectional communication channel between processes (e.g. /dev/pts/1). The master device (ptm or ptmx) is controlled by a terminal emulator process or remote login server; anything written to the master is given to the slave as input. The slave device (pts) acts exactly like a traditional hardware terminal (tty) device to the programs running within it; programs write their output to the slave, which is then read by the master process. The SSH daemon process on the remote server manages the connection: it forwards the client keystrokes received over the network to the pty master and reads output from the master to send it back to the client. The shell that starts on the remote server and runs the client’s commands receives input from the pty slave, which is being fed by the SSH daemon.
  • 30.
    How to make two computers talk? (Image source: iStock)
  • 31.
    Internet as a file (Server ↔ Client)
  • 32.
    Being blocked
    Read on a descriptor blocks if there’s no data available: fd = open(…) then read(fd). Did somebody press a key? Did somebody move the mouse? Did somebody send a message over the network? The same is true for write.
  • 33.
    Being blocked
    Disk files are an exception, since writes to disk happen via the kernel buffer cache, not directly: fd = open(…) then write(fd,…) goes to the buffer cache. The only time when writes to disk happen synchronously is when the O_SYNC flag was specified when opening the disk file.
  • 34.
    Non-blocking file descriptors
    A descriptor can be put in the nonblocking mode by setting the O_NONBLOCK flag. In this case, a call on that descriptor will return immediately, even if that request can’t be immediately completed. The return value can be either of the following: an error, when the operation cannot be completed at all; a partial count, when the input or output operation can be partially completed; the entire result, when the I/O operation could be fully completed.
  • 36.
    Running more than we can
    Context switching is the process where the CPU saves the state of a running task and loads the state of another task to switch from one to another, enabling multitasking by giving the illusion that multiple processes are running simultaneously.
  • 37.
    Concurrency
    Concurrency refers to the ability of a system to execute multiple tasks through simultaneous execution or time-sharing (context switching). Motivations for concurrent applications: improved performance by executing multiple tasks in parallel; better resource utilization, e.g., CPU and I/O devices are kept busy; scalability, i.e. handling more clients, requests, or tasks simultaneously; modularity, since concurrent tasks can be designed as independent components; fault isolation, since failures in one task may not crash the entire system.
  • 38.
    Multi-threaded vs Multi-process
    Threads and the main process use the same memory space; each process uses its own memory space.
  • 39.
    Combined model (Image credit: M. van Steen and A.S. Tanenbaum, Distributed Systems, 4th ed., distributed-systems.net, 2023)
  • 40.
    Implementation options
    Mx1: the kernel is not aware of any of the user threads; there is only one thread/process in the kernel, serving the user-space scheduler. 1x1: each user thread is mapped to one kernel thread. MxN: a combination of Mx1 and 1x1; a kernel thread may serve an individual user thread or the scheduler. (Image credit: medium post)
  • 41.
    My first discovery of Concurrency
    Case 1: Advanced Payment System. Case 2: Billing System for Azeronline. “Many years later, as he faced the firing squad, Colonel Aureliano Buendía was to remember that distant afternoon when his father took him to discover ice” (Gabriel García Márquez, One Hundred Years of Solitude)
  • 42.
    Advanced Payment System: search a number (e.g. 050123456) → Name & Surname → make payments → operation result
  • 43.
    Advanced Payment System
    “Your number please” — two customers answer “111222” at once. “Are you John Smith?” “Sorry, I am Bob.” “Are you Bob Adams?” “Sorry, I am John.”
  • 44.
    Concurrency problems: Competition vs Cooperation; Race Condition (multiple tasks updating a shared resource)
  • 45.
    Concurrency solutions: Mutex, Semaphore, Monitor (notify), Non-blocking
  • 46.
    Concurrency Demo: one processor involved. Calculations in loops 1-8 run one after another, then the total is calculated.
  • 47.
    Concurrency Demo: multiple processors involved. Loops 1-8 execute in parallel, then the total is calculated.
  • 48.
    A visual demo: Image Quantization
  • 49.
    A practical application: handling client requests (but the server can serve only 6 of them at a time!)
  • 50.
    A concurrency example: Apache Web Server
    Control Process (Parent) → Child Processes → Listener Threads and Server Threads → Web Clients
  • 51.
    Problems
    Although threads are generally lighter than processes, a large number of concurrent threads can still consume significant memory and CPU resources, potentially leading to performance degradation under heavy load or with a high number of idle connections. The thread-per-connection model can encounter scalability limitations, especially when handling a massive number of concurrent, long-lived connections. Creating, managing, and destroying a large number of threads can introduce overhead, especially under fluctuating or bursty workloads, potentially impacting overall performance.
  • 52.
    C10k problem
    Stated by software engineer Dan Kegel in 1999. Stands for 10,000 concurrent requests: the C10k problem was the challenge of designing network servers that could handle 10,000 concurrent client connections.
  • 53.
    Nginx story
    2002: Russian software engineer Igor Sysoev began developing Nginx to solve the “C10K problem”, handling 10,000 simultaneous connections efficiently. 2004: Nginx was publicly released as open-source software under a BSD-like license. 2006-2010: it gained popularity for its event-driven, asynchronous architecture, outperforming Apache in serving static content and handling high concurrency. 2017: surpassed Apache in market share for the top 1000 busiest websites, marking a significant shift in web infrastructure.
  • 54.
    Clients → Master → Workers → CPU
    The master is responsible for reading configuration, creating sockets, managing signals, and spawning workers; it does not handle client I/O. Each worker runs an independent event loop, accepts client connections and handles requests. Each worker is single-threaded, non-blocking, and uses I/O multiplexing for thousands of concurrent connections.
  • 55.
    Resources are naturally shared (in the same codebase) among the threads and the process. Between processes there is no common memory: none of the resources of one process is visible outside. How to share the user requests, sockets and other resources?
  • 56.
    A socket is listening for the requests. The descriptor of this socket (a file) is shared among processes via the OS kernel; processes use the OS to check whether there is anything in the socket.
  • 57.
    I/O Multiplexing
    We showed how a process handles I/O on a single descriptor. Often, a process might want to handle I/O on more than one descriptor. I/O multiplexing is a technique that allows a single thread to monitor and manage multiple I/O channels, such as network sockets, to handle many connections efficiently.
  • 58.
    Problem: requesting multiple sources in an asynchronous mode. Blocking is not a solution for sure: we cannot keep waiting! The non-blocking approach might not be the right approach either: when data is coming in very slowly the program will wake up frequently and unnecessarily, which wastes CPU resources; when data does come in, the program may not read it immediately if it’s sleeping, so the latency of the program will be poor; and handling a large number of file descriptors with this pattern would become cumbersome. Solution: I/O multiplexing modes need to be considered.
  • 59.
    I/O Multiplexing
    There are several ways of multiplexing I/O on descriptors: non-blocking I/O (the descriptor itself is marked as non-blocking, operations may finish partially); signal driven I/O (the process owning the descriptor is notified when the I/O state of the descriptor changes); polling I/O (with select or poll system calls, both of which provide level-triggered notifications about the readiness of descriptors).
  • 60.
    I/O Multiplexing: non-blocking I/O
    All file descriptors are set to non-blocking mode. If a process tries to perform I/O operations very frequently, the process has to continuously retry operations that returned an error to check if any descriptors are ready. Such busy-waiting in a tight loop could lead to burning CPU cycles.
  • 61.
    I/O Multiplexing: signal driven I/O
    The kernel is instructed to send the process a signal when I/O can be performed on any of the descriptors: the kernel tracks a list of descriptors and sends the process a signal every time any of the descriptors become ready for I/O. Signals are expensive to catch, rendering signal driven I/O impractical for cases where a large amount of I/O is performed.
  • 62.
    I/O Multiplexing: polling I/O
    The process uses the level-triggered mechanism to ask the kernel, via a system call, which descriptors are capable of performing I/O. There are a few I/O multiplexing system calls: select (defined by POSIX), epoll on Linux, and kqueue on BSD. These all work fundamentally the same way: they let the kernel know what events (typically read events and write events) are of interest on a set of file descriptors, and then they block until something of interest happens.
  • 63.
    epoll
    epoll is a Linux kernel system call for an I/O event notification mechanism. epoll monitors multiple file descriptors to see whether I/O is possible on any of them. epoll uses a red-black tree (RB-tree) data structure to keep track of all file descriptors that are currently being monitored.
  • 64.
    Edge-Triggered Polling (ET)
    Events are delivered only when the state transitions (from not-ready → ready). Example: you get a read event only when new data arrives, not while data remains unconsumed. Requires reading/writing until EAGAIN to avoid missing future events. More efficient because it avoids repeated notifications, but more complex to implement correctly: mismanagement may lead to stalled I/O.
  • 65.
    Level-Triggered Polling (LT)
    The default behavior in many polling APIs. A file descriptor is reported as “ready” as long as the condition remains true (e.g., there is unread data in the buffer). The application may receive repeated readiness notifications until the condition is cleared. Safer and simpler but may cause unnecessary repeated wake-ups.
  • 66.
    Nginx - Full Scenario
    The master reads nginx.conf, creates listening sockets, binds to ports (e.g., 80, 443), and sets them as non-blocking. Then it forks the worker processes. Each worker inherits the listening socket file descriptors from the master. During its startup routine each worker initializes its event loop subsystem, calls epoll_create(), and stores the returned epoll file descriptor for future event registration.
  • 67.
    Context Switch and Pinning
    Context switching in multiprocessing: the CPU rapidly switches between processes, pausing one and resuming another so they can “share” the same core. Each switch has a cost: the CPU must save the current state, load another one, and refill caches, which can slow systems down under heavy load. Nginx ties each worker process to a specific CPU core, avoiding unnecessary movement between cores. This improves performance and lowers latency, because the worker keeps its core’s warm caches and avoids extra context-switch overhead.
  • 68.
    Customer Service Analogy - Apache: only 4 computers available, but 25 people hired to handle customer requests
  • 69.
    Customer Service Analogy - NGINX: only 4 computers available and only 4 employees, each with a dedicated PC. They pick the next task when they have time (while a previous task is waiting on something). Example: one customer may have 10-second work, while another may require an approval from the head office or credit system.
  • 70.
    Another asynchronous event loop model: Node.js
  • 71.
Concurrency in Functional Programming languages: Immutability simplifies concurrency – no shared mutable state, so fewer race conditions than in imperative code. Pure functions parallelize naturally – no side effects, so tasks run safely on different cores without coordination. Actor-based concurrency – many FP languages use message passing (Erlang, Elixir, Scala/Akka) instead of shared memory. Higher-level concurrency primitives – futures, promises, parallel map, and lazy streams reduce the need for locks.
  • 72.
Recommendations: 1. Learn Linux – it is a monolithic OS; check out the alternatives. What will the OSs in quantum computers be? 2. Learn concurrency programming. 3. Learn low-level programming – you cannot get ready for quantum computing if you don't understand hardware well.
  • 73.
  • 74.
References: “What is epoll?” – a Medium post; “Non-blocking I/O” – a Medium post; “Blocking I/O, Nonblocking I/O, And Epoll” (link)
  • 75.
Hey all, I’m Amulya Bhatia! Chief Architect, Gen AI SME, GDE. They/Them. My ADK Book · LinkedIn
  • 76.
Firebase Studio is an integrated and extensible agentic workspace to build, run, and manage web apps, cross-platform mobile apps, backend services, and more.
  • 77.
Web-based – your workspace is a URL, installable as a PWA. Full cloud VM – supports most toolchains. Deep Google Integrations – streamline app dev workflows. Preconfigured Environments – start from common starting points or customize your own. Live preview – real device, emulator, or hosted IFRAME. Designed for collaboration – workspace sharing available, with more features planned. AI Assistance – across code, test, debugging, etc. Built on VS Code – world-class code editing.
  • 78.
  • 79.
  • 80.
  • 81.
  • 82.
  • 84.
Agentic Experiences Throughout: Our advanced coding capabilities are enhanced with agentic experiences that take complex (or boring) actions on your behalf. Whether the changes need to happen across a section of code, a single file, or an entire code base, Gemini will understand your intent and accomplish the task.
  • 85.
AI-centric View vs Code-centric View: We want to ensure you all have a choice in using as much or as little AI as you want when building your apps.
  • 86.
Share and Collaborate in Real-Time: Not only can you share the deployed link, you can share the entire workspace with a URL. This means you can collaborate in real-time within the same Firebase Studio environment, and then push updates instantly.
  • 87.
  • 88.
App Prototyping Agent: To build our full-stack web application, we can start with a natural-language prompt and it will create a PRD for us to review. After we modify the proposal as needed, we can generate the app and iterate in chat to update and rebuild the application.
  • 89.
We can prototype quickly to get a Next.js app with Genkit for agentic features that is connected to Firebase!
  • 90.
Quickly Deploy to Firebase: We can quickly deploy the application we created to Firebase App Hosting. The wizard will guide you through picking the correct project and billing account, and kick off the new or updated deployment.
  • 91.
Firebase Studio supports existing codebases and any stack that can be installed with Nix (120k+ packages)
  • 92.
Exploring the Code: Every workspace is backed by a full IDE in which we can edit the generated code. There is a full VM backing Firebase Studio, so you can run commands that would usually fail in the browser.
  • 93.
Editing the code with Gemini: In the IDE Code view of Firebase Studio we still have the full power of Gemini in the workspace. Gemini can read and write files and run terminal commands with the full context of the project, recently opened files, and any attachments sent.
  • 94.
Proprietary + Confidential – Gemini CLI: a lightweight, powerful, and accessible CLI tool that integrates cutting-edge AI directly into the terminal.
  • 95.
Gemini CLI: Code and Files – generate code and manage files (the core CLI function). Invoke tools – intelligently invoke other developer tools, with MCP support, to manage local development, run tests, interact with cloud services, etc. Coordinate with other apps – coordinate with other applications like VS Code or Chrome, performing actions or gathering context. Comprehensive Context – including project files, data, and potentially even screen sharing, to provide the most relevant and effective assistance.
  • 96.
Gemini CLI Architecture: User via Terminal → packages/cli → packages/core → Tools / MCP Servers and the Gemini/Code Assist API. The user sends tasks/prompts via the terminal and receives outputs and requests (to perform actions via tools). packages/cli sends user requests/confirmations to the core and receives tool details and final outputs from the core. The core sends prompts/tool info to Gemini, and executes tools and receives results – from local tools (file, shell, web) and local and remote MCP servers.
  • 97.
Interactive and File/Shell Integration: / commands, @ context, ! shell
  • 98.
Powerful Built-In Tools: file-system tools (read_file, write_file, list_directory, search_file_content); shell tool (run_shell_command); web tools (web_fetch, google_web_search)
  • 99.
Hierarchical Context & Memory: project-specific and global instructions to the AI using GEMINI.md files
  • 100.
Secure Sandboxing: run in an isolated environment using technologies like Docker, Podman, or macOS's native Seatbelt functionality
  • 101.
Custom Commands and MCP: create custom slash commands for frequently used prompts; Model Context Protocol (MCP) servers
  • 103.
Gen AI Toolbox benefits – better manageability, observability, and security for your gen AI agents: Simplified development – reduced boilerplate code to simplify tool development. Better performance – efficient connection pooling and optimized connectors for databases. Zero downtime – a configuration-driven approach enables deployment without interruption. Enhanced security – provides simple patterns for integrating user authentication. End-to-end observability – integrated with Google Cloud monitoring and tracing.
  • 104.
Toolbox flow: 1. Access database – specify the URI. 2. Define tools – load tools. 3. Invoke tool.
  • 105.
tools.yaml – Configuration:
● Users define several resources in a file with 3 sections: Sources, Tools, Toolsets
● Each is defined in a map of name → object definition
● Toolbox loads them on startup and builds the appropriate APIs
● A tool (e.g. the user-defined postgres-sql tool) interacts with one source (e.g. Cloud SQL, AlloyDB, Postgres)
  • 106.
Define sources – Sources represent a source of data that a tool can use. Typically they encapsulate _how_ a database is connected to – IP address, credentials, etc.

sources:
  # This tool kind has some requirements. See
  # https://github.com/googleapis/genai-toolbox/blob/main/docs/sources/cloud-sql-pg.md#requirements
  my-cloud-sql-source:
    kind: cloud-sql-postgres
    project: my-project-name
    region: us-central1
    instance: my-instance-name
    user: my-user
    password: my-password
    database: my_db
  • 107.
Define Tools – Tools are an action, typically executed on a source. These include a description of how/when to take the action, and things like which parameters are specified.

tools:
  get_flight_by_id:
    kind: postgres-sql
    source: my-cloud-sql-source
    description: >
      Use this tool to list all airports matching search criteria.
      Takes at least one of country, city, name, or all and returns
      all matching airports. The agent can decide to return the
      results directly to the user.
    statement: "SELECT * FROM flights WHERE id = $1"
    parameters:
      - name: id
        type: int
        description: "'id' represents the unique ID for each flight."
  • 108.
Define Toolsets – Toolsets are logical groups of tools. You can load all tools, or load tools by toolset, to pass into an agent.

toolsets:
  my_first_toolset:
    - my_first_tool
    - my_second_tool
  my_second_toolset:
    - my_second_tool
    - my_third_tool

# This will load all tools
all_tools = await client.load_toolset()

# This will only load the tools listed in 'my_second_toolset'
my_second_toolset = await client.load_toolset("my_second_toolset")
  • 109.
  • 112.
AI TRACK – Mirakram Aghalarov & Zahra Bayramli, Nishi Ajmera, Kamran Huseynov & Tarlan Huseynov, Nikhilesh Tayal, Yoyu Li
  • 113.
WHEN LLM STAYS HOME: BUILDING KNOWLEDGE SYSTEMS ON PREMISES – Mirakram Aghalarov, Zahra Bayramli
  • 114.
</SPEAKERS: Mirakram Aghalarov – Senior Deep Learning Engineer @SOCAR CIC; Lecturer of AI&ML courses at BHOS; MSc Data Science and Engineering graduate from Politecnico di Torino. Zahra Bayramli – Deep Learning Engineer @SOCAR CIC; BSc Computer Science graduate from Korea Advanced Institute of Science & Technology (KAIST).
  • 115.
//Table of contents: {01} Cloud Infrastructure {02} On-Premise Infrastructure {03} Hybrid Solution
  • 116.
  • 117.
</AI Startups Worldwide: ~92,000 in 2024 → ~212,000 in 2025
  • 118.
Delivery Hero SE is a German multinational online food ordering and food delivery company based in Berlin, Germany. REI is a member-owned retailer that sells outdoor gear, promotes sustainable adventure, and does customer-specific design using GenAI. </Distribution over Cloud
  • 119.
The AI developer platform to build AI agents, applications, and models with confidence. Accelerating AI development and deployment with a secure collaboration platform for AI developers and data providers. </Distribution over Cloud
  • 120.
Deepset is an enterprise software vendor that provides developers with the tools to build production-ready artificial intelligence and natural language processing systems. Robin AI is a legal-tech company that provides an AI-powered platform to review, analyze, and manage contracts far more quickly and securely. </Distribution over Cloud
  • 121.
</How to build a simple Chatbot: Vertex AI Studio, Vertex AI Search, Agent Garden, Ray
  • 122.
  • 123.
</How to build a simple Chatbot: GCP provides an easy-to-use pipeline builder to establish the flow. To build a simple RAG agent, Vertex AI Studio helps integrate the chatbot with Vertex AI Search capabilities on top of the selected vector database.
  • 124.
</How to build a simple Chatbot (Azure): AI Document Intelligence, AI Content Understanding, OpenAI, Model Registry, Azure Machine Learning
  • 125.
  • 126.
</How to build a simple Chatbot (AWS): Amazon SageMaker, Amazon Bedrock, AWS Lambda, AWS S3, Amazon Kendra, Amazon API Gateway
  • 127.
  • 128.
  • 129.
  • 130.
  • 131.
  • 132.
</Agentic AI – User: “My dear, tell me about the weather in our 4th plant.” Agent: “What a marvellous question! I am going to check the location of ‘our 4th plant’…” (SQL call to the database) “Okay! Now the location of plant with ID 4 is known. Now let's look at web services for the weather information.” (Searching through the web) “The answer is 26 degrees Celsius.”
  • 133.
  • 134.
  • 135.
</Cloud Evaluation – Pros: very low downtime, high scalability, mature solutions, low-code environment, faster deployment, no CAPEX. Cons: higher OPEX, network bandwidth limitations, uncontrollable cost, not suitable for real time, limited privacy.
  • 136.
  • 137.
</Azerbaijan Situation – Limited Privacy: Azerbaijani legislation does not allow private information to be kept in cloud infrastructure because of where that infrastructure is located. Cyber threats and privacy-assurance requirements mean no sensitive information may be sent outside Azerbaijan's borders.
  • 138.
</Azerbaijan Situation – Limited privacy leads to: higher time to MVP, higher requirements to accomplish the task, and a smaller number of startups.
  • 139.
  • 140.
</Need for On-prem: Data Privacy and Compliance – sensitive documents, internal systems. Restricted Environments – factories, labs, government facilities with no external internet access. Cost – cloud APIs become expensive with higher usage. Full Customization and Control – ability to retrain, fine-tune, or modify models without vendor limits.
  • 141.
</RAG Architecture Example – Document Ingestion: Optical Character Recognition, text/table extraction, chunking → Embedding Generation: an embedding model produces vector representations (e.g. [0.2, 0.7, 0.5, …]) → Vector Database: stores the embeddings and allows fast similarity search.
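The chunking step can be sketched with a naive fixed-size splitter with overlap (sizes are illustrative; production pipelines usually split on sentence or token boundaries instead of characters):

```python
# Naive character-based chunker with overlap for the ingestion pipeline.
def chunk(text, size=20, overlap=5):
    chunks = []
    step = size - overlap
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += step
    return chunks

# A 50-character synthetic "document" so the overlap is easy to verify.
doc = "".join(chr(97 + i % 26) for i in range(50))
parts = chunk(doc)
print(len(parts))                     # 3 chunks for 50 chars
# Consecutive chunks share `overlap` characters:
assert parts[1][:5] == parts[0][-5:]
```

The overlap ensures a sentence straddling a chunk boundary is still fully contained in at least one chunk.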
  • 142.
</RAG Architecture Example – Retriever: vector search, hybrid search, reranker → LLM (on-prem): receives the prompt + retrieved chunks → Application Layer: UI/chat interface, FastAPI/tools → Output.
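The retriever's vector-search step is at heart a nearest-neighbour lookup over embeddings. A toy sketch with made-up 3-dimensional vectors and document texts (a real system would use one of the embedding models listed later and a vector database):

```python
import math

# Cosine similarity between two equal-length vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Tiny in-memory "vector database": chunk text -> pretend embedding.
index = {
    "plant 4 is in Ganja":   [0.9, 0.1, 0.0],
    "Q3 target is 100K":     [0.1, 0.8, 0.2],
    "the meeting is at 4pm": [0.0, 0.2, 0.9],
}

# Pretend embedding of the query "where is plant 4?"
query_vec = [0.85, 0.15, 0.05]

# Retrieve the chunk whose embedding is closest to the query embedding.
best = max(index, key=lambda doc: cosine(query_vec, index[doc]))
print(best)  # the chunk about plant 4
```

The retrieved chunk is then concatenated with the user prompt and handed to the on-prem LLM.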
  • 143.
  • 144.
  • 145.
  • 146.
  • 147.
  • 148.
</Challenges with Open-Source OCR: Complex tables – merged cells, nested tables. Offline VLMs – large, slow, proprietary. Handwritten documents – cloud APIs (Azure AI Document Intelligence, Google Vision API) beat open source. Also: high VRAM requirements, no fine-tuned OCR pipelines, latency too high without GPU clusters.
  • 149.
  • 150.
  • 151.
</Core Components of On-Prem Retrieval: local vector database; embedding model (bge-m3, E5-large, Nomic-embed, Jina v2, MiniLM, Instructor-xl); reranker (bge-reranker-large, ColBERT).
  • 152.
</Vector Database Capabilities Missing: Hybrid Search – cloud versions tune this automatically; on-prem requires manual scoring & fusion. Premium Features – e.g. Reciprocal Rank Fusion (RRF) in Elasticsearch is not available on-premise and requires a custom implementation. Enterprise-Grade Monitoring & Analysis – cloud dashboards show slow queries, index health, drift detection, etc.
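Reciprocal Rank Fusion is small enough to reimplement by hand on-prem. A sketch fusing a keyword ranking with a vector ranking (k = 60 is the commonly used constant; the document names are made up):

```python
# Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document,
# so documents ranked well in several lists float to the top of the fusion.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. a BM25 keyword ranking
vector_hits = ["doc_a", "doc_d", "doc_b"]    # e.g. an embedding ranking

fused = rrf([keyword_hits, vector_hits])
print(fused[0])  # doc_a: ranked first in both lists
```

This is the "manual scoring & fusion" the slide refers to: two independent rankings in, one hybrid ranking out.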
  • 153.
</Challenges with Open-Source Models – Sensitivity to Noisy Input: DeepSeek/Qwen degrade more on messy documents. Noisy input: “Mtng w/ team @ 4pm pls updte repprt Q3 target = 1O0K ??? ask Sam re: buget”. Llama: “Meeting with the team at 4pm. Update the report. Q3 target is 100K. Ask Sam about the budget.” Qwen: “The meeting is about 4pm and a report, possibly about budget. Not sure what target means.”
  • 154.
</Challenges with Open-Source Models: Sensitivity to noisy input – DeepSeek/Qwen degrade more on messy documents. Context size expansion – big documents ⇨ huge prompts ⇨ slower inference. Quality & alignment – more hallucination, lower-quality answers.
  • 155.
</Agentic AI On-Prem – Model Context Protocol (MCP): on-prem LLMs are often not aligned for tool calls. Architecture: an MCP host speaks the MCP protocol to MCP servers A, B, and C, which front local file storage, external APIs & apps, and a remote database.
  • 156.
</Agentic AI On-Prem – Unreliable Function Calling: no built-in tool-calling alignment like ChatGPT or Gemini. ReAct agents do not work – they lose state across steps ⇨ repeat wrong calls, produce inconsistent “thought/action” formatting, and hallucinate tool names or steps.
  • 157.
</Agentic AI On-Prem – Knowledge Graph Integration: entity extraction over the database produces entities and relationships stored in a graph DB; given a user query, the agent (via the LLM) generates a graph query, runs it against the graph DB, and returns the result as output.
  • 158.
</Agentic AI On-Prem – Knowledge Graph Integration: the LLM extracts structured entities and relations using long rule-based prompts, with recursive multi-pass extraction across chunks.
  • 159.
</GPU infra: Inference & Deployment – TensorRT-LLM: provides 2–4x faster throughput compared to raw PyTorch; provides better parallelism; supports FP8/INT4 quantization to reduce memory usage; requires custom engine building per GPU model.
  • 160.
</GPU infra: Inference & Deployment – vLLM / TGI: high-throughput distributed serving; good batching & streaming performance; supports HF models & the OpenAI API standard; requires a full GPU stack. Ollama / llama.cpp: easy setup, simple single-node runtime; supports GGUF quantized models with a small memory footprint; limited scaling & lower token throughput.
  • 161.
</On-Premise Evaluation – Pros: high privacy, faster communication, real-time streaming, full control over cost, more customization. Cons: limited scalability, high CAPEX, high downtime, higher development time.
  • 162.
</Best of 2 Worlds: high privacy, faster communication, real-time streaming, full control over cost, and more customization (on-prem) combined with very low downtime, high scalability, mature solutions, a low-code environment, and faster deployment (cloud).
  • 163.
Hybrid is a Possible Choice: video, text, and data streams (e.g. “Clean my code! Now!”) pass through blurring, masking, and anonymization before leaving the premises.
  • 164.
Hybrid is a Possible Choice: after blurring/masking/anonymization, the request goes to the cloud LLM (with Tools 1–3), and the response is demasked.
  • 165.
Hybrid is a Possible Choice: after demasking, the LLM answer is returned to the user.
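A sketch of the mask → cloud LLM → demask flow above. The single e-mail regex and the fake cloud-model string are illustrative assumptions, not a production anonymizer:

```python
import re

# Replace e-mail addresses with placeholder tokens before leaving the premises,
# remembering the mapping so the cloud model's answer can be demasked locally.
def mask(text):
    mapping = {}

    def repl(m):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = m.group(0)
        return token

    masked = re.sub(r"[\w.]+@[\w.]+", repl, text)  # mask e-mail addresses only
    return masked, mapping

def demask(text, mapping):
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

masked, mapping = mask("Contact sam@example.com about plant 4")
# Pretend cloud LLM output that echoes the placeholder token back:
cloud_answer = f"I e-mailed {masked.split()[1]} as requested"
final = demask(cloud_answer, mapping)
print(final)
```

Real deployments would cover names, IDs, and addresses too (e.g. with an NER model), but the token round-trip is the same.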
  • 166.
//Conclusion: It is not easy to keep pace with development in Azerbaijan compared to the worldwide scale. The only way out is to understand the reasons behind the limitations and find ways around them, eliminating the disadvantages of both sides.
  • 167.
//Thanks – do you have any questions? CREDITS: This presentation template was created by Slidesgo, and includes icons by Flaticon and infographics & images by Freepik.
  • 168.
Building Conversational Agents Using MCP, A2A and ADK
  • 169.
  • 170.
What is an AI Agent? A software entity designed to act autonomously to achieve specific goals. It performs tasks, interacts with users, and utilizes external tools. Agents go beyond simple input/output – they can reason, plan, and orchestrate.
  • 171.
What are Conversational Agents? AI systems that use natural language to interact with users and complete tasks – e.g. a coding assistant, a voice assistant (Siri, Alexa), or a support bot.
  • 172.
Agentic Architectures – from simple tasks to complex workflows. Modularity: break down complex problems into smaller, manageable agent tasks. Specialization: create expert agents for specific functions (e.g., a “billing agent,” a “research agent”). Collaboration: agents can work together, delegating tasks and sharing information. Scalability & Maintainability: easier to update, debug, and scale individual components.
  • 173.
Agent Development Kit (ADK): a flexible and modular framework for developing and deploying AI agents.
  • 174.
Agent Development Kit – Key Goals: make agent development feel like software development; simplify creation, deployment, and orchestration. Core Principles: model-agnostic (optimized for Gemini, but supports others via LiteLLM); deployment-agnostic (local, Cloud Run, Agent Engine); compatible with other frameworks (e.g., LangChain, CrewAI).
  • 175.
Model Context Protocol: MCP standardises the way AI models and tools communicate and share context.
  • 176.
  • 177.
  • 178.
Agent to Agent Protocol: enables seamless communication and coordination between multiple AI agents.
  • 179.
  • 180.
  • 185.
  • 186.
MCP vs A2A:
Aspect        | MCP (Model Context Protocol)     | A2A (Agent2Agent Protocol)
Purpose       | Agent ↔ Tools/Resources          | Agent ↔ Agent Collaboration
Communication | Client-Server (function-like)    | Peer-to-Peer (conversational)
State         | Stateless (tools as functions)   | Stateful (task lifecycle)
Best For      | Accessing tools, APIs, databases | Multi-agent coordination
  • 187.
Agent as a Tool? When to use agent-as-tool (MCP): single-orchestrator architecture; short-to-medium tasks (minutes to ~1 hour); need tight control over the workflow; simple request-response patterns; you need deterministic, structured interactions. When to use A2A: long-running tasks (hours to days); peer collaboration and negotiation; dynamic agent discovery; multi-vendor ecosystems.
  • 188.
MCP ensures agents have the right context and tools to operate efficiently, while A2A enables seamless collaboration. Together, they create a powerful, interoperable AI ecosystem.
  • 189.
  • 190.
  • 191.
Agentic Workflows with AWS – DevFest 2025. Tarlan Huseynov, DevOps Engineer, AWS Community; Kamran Huseynov, AI Engineer.
  • 199.
Semantic Search and Retrieval in a RAG Pipeline (embed → search → retrieve → generate): Phase 1 – chunking, embedding & storing. Phase 2 – semantic search: embedding inference + similarity search. Phase 3 – augmented generation & response.
  • 200.
  • 201.
Strands Agents Solution: Strands Agents – AWS-backed, production-ready agent workflows; model-driven, composable, and emphasizing simplicity. MCP – modular tool servers, scalable & observable. Self-managed provisioning for the logical stack (ECS + Fargate).
  • 202.
Building With Strands Agents – GitHub
  • 210.
  • 211.
  • 212.
AI Agents = LLMs + reasoning + external applications + self-reflection capabilities
  • 213.
Do you need an AI Agent? Give me 10 ideas for my Twitter post on AI. Prepare a report of top AI research papers. Send Sachin a leave request and update my calendar accordingly. Book the cheapest flight from Delhi to Dubai. Translate this paragraph from Hindi to English. Write an email requesting leave in a polite tone.
  • 214.
  • 215.
  • 216.
AI Agent with Google Search
  • 217.
How the model was thinking
  • 218.
  • 219.
AI Agents actually don’t have any memory
  • 220.
Memory has to be handled outside the LLM
  • 221.
AI Agent Memory: memory is important for conversations
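The point fits in a few lines: the model itself is stateless, so the application keeps a history list and replays it on every call. `fake_llm` is a stand-in for a real model call and just reports how many messages it saw:

```python
# Conversation memory kept OUTSIDE the (stateless) model: the application
# appends every turn to `history` and replays the whole list on each call.
history = []

def fake_llm(messages):
    # A real call would send `messages` to an LLM; this stub just proves
    # the model receives the full history, including earlier turns.
    return f"(model saw {len(messages)} messages)"

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply

chat("My name is Leyla")
second = chat("What is my name?")
print(second)  # (model saw 3 messages)
```

Without the replay, the second call would arrive with a single message and the model could not possibly answer the name question.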
  • 222.
  • 223.
  • 224.
  • 225.
  • 226.
AI Agent with a conversation memory
  • 227.
  • 228.
  • 229.
  • 230.
  • 231.
  • 232.
  • 233.
AI Agent with self-managing memory
  • 234.
Tool for memory management
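A self-managing agent exposes memory operations as tools it can call itself. A minimal sketch, where the tool names (`save_memory`, `recall`) and the stored facts are illustrative assumptions:

```python
# Hypothetical memory tools an agent could be given: one to persist a fact
# it decides is worth remembering, one to retrieve facts by keyword.
long_term_memory = []

def save_memory(fact: str) -> str:
    """Tool: persist a fact the agent decides is worth remembering."""
    long_term_memory.append(fact)
    return f"saved: {fact}"

def recall(keyword: str) -> list:
    """Tool: retrieve previously saved facts matching a keyword."""
    return [f for f in long_term_memory if keyword.lower() in f.lower()]

# In a real agent these calls would be triggered by the model's tool calls.
save_memory("User's name is Leyla")
save_memory("User prefers metric units")
print(recall("name"))
```

The difference from plain conversation memory is that the model, not the application, decides what gets written and read.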
  • 235.
  • 236.
  • 237.
There are different kinds of “Memory”
  • 238.
AI Agent Evaluation: how to evaluate the chaos?
  • 239.
Can you identify what went wrong here?
  • 241.
What could go wrong with AI Agents
  • 242.
  • 243.
  • 244.
  • 245.
Suite of LLM Evaluation Methods
  • 246.
  • 247.
  • 248.
Evaluation Techniques: code-based evals; LLM as a judge; human annotations
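Code-based evals, the cheapest of the three techniques, are just deterministic checks on an agent's output. A sketch where the captured `agent_output` and the expected tool name are made up for illustration:

```python
# Deterministic (code-based) checks on a single captured agent step.
# The output record and the checks are illustrative assumptions.
agent_output = {
    "tool": "get_weather",
    "args": {"city": "Baku"},
    "answer": "26 degrees Celsius",
}

def eval_tool_choice(output, expected_tool):
    # Did the agent pick the tool the test case expects?
    return output["tool"] == expected_tool

def eval_answer_format(output):
    # Does the final answer satisfy a simple format requirement?
    return "degrees" in output["answer"].lower()

results = {
    "tool_choice": eval_tool_choice(agent_output, "get_weather"),
    "answer_format": eval_answer_format(agent_output),
}
print(results)
```

Anything these checks cannot express (tone, factuality, reasoning quality) is where LLM-as-a-judge or human annotation takes over.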
  • 249.
  • 250.
  • 251.
  • 252.
  • 253.
LLM as a judge
  • 254.
LLM as a judge – important considerations
  • 255.
  • 256.
Elements to evaluate: tool choice; generation; path choice
  • 257.
If your Agent’s output is correct, does the trajectory matter?
  • 259.
  • 260.
Google ADK – available as a Python SDK
  • 261.
  • 262.
  • 263.
Google ADK – Agent Development Kit
  • 264.
  • 265.
Happy to connect on LinkedIn. Founder at “AI ML etc.”; AI Instructor at LinkedIn Learning; Google Developer Expert for AI; IIT Kharagpur alumnus.
  • 266.
  • 267.
Beyond the Prompt: An Anatomy of an AI-Powered Game Using Agent Development Kit – Yoyu Li, Creative Technology Director, Infinite Whys. Baku, 2025.
  • 268.
What we are going to talk about: ● what ADK is, and why (spoiler: Agent Development Kit) ● a game demo ● a simple agent ● the multi-agent pattern ● a custom agent
  • 269.
01 – what is ADK*? and why? (*Agent Development Kit)
  • 270.
An Evolution of AI Agents: LLM + Prompt → LLM + Retrieval (RAG) → LLM + Retrieval + Tools → + Many Tools + Reasoning Loop (Agent) → Multi-Agent Systems
  • 271.
AI Agents reason, plan, and execute tasks for end users. Key components: Model(s) – used to reason over goals, determine the plan, and generate a response (an agent can use multiple models). Orchestration – the model-based reasoning/planning and task-execution loop; executes the steps of the LLM-derived plan to accomplish given tasks, including tool invocations and maintenance of intermediate state. Agent definitions – profile, goals, instructions, tools, … Tools – fetch data, perform actions or transactions by calling other APIs or services (functions, APIs, database queries). Memory – short-term and long-term.
  • 272.
Is there something more? ADK helps us build a good architecture for AI-powered applications. Having specialised agents means we can potentially use smaller/more efficient AI, or even no AI. And we should only use Gen AI when it makes sense to do so.
  • 273.
02 – Demo time: a simple game built in Python
  • 274.
(this is for demo only, therefore not a very sophisticated application)
  • 275.
03 – An anatomy: what is under the hood
  • 276.
File structure: main.py and ui_components.py (display layer); game_controller.py (control layer); adk_runners.py and test_runners.py (ADK runners & sessions); validation agent folder with agent.py (simple agent); question agents folder with agent.py (multi agents); .env file. Git repo: https://github.com/yoyu777/adk-game-demo-public
  • 277.
04 – a simple LLM agent to validate the user input
  • 278.
  • 279.
Create the agent using the ADK command line interface: > adk create [agent_name]
  • 280.
from pydantic import BaseModel, Field
from google.adk.agents import LlmAgent

class ValidationOutput(BaseModel):
    is_valid: bool = Field(..., description="Indicates if the input is a valid object for the game.")
    reason: str = Field(..., description="Explanation of why the input is valid or not.")

root_agent = LlmAgent(
    name="validation_agent",
    model="gemini-2.5-flash",  # Or your preferred Gemini model
    instruction="You are in charge of validating the user's initial input",
    description="""The user is going to play a game of 20 Questions.
    Before starting the game, you need to validate the user's input to
    ensure it is a valid object for the game. A valid object should be
    a noun like the name of an object, an animal, or a concept, and not
    too obscure. For example, "cat", "car", "apple" are valid, but
    "quantum entanglement" or "the number seven" are not.""",
    output_schema=ValidationOutput,
)

https://github.com/yoyu777/adk-game-demo-public/blob/main/validation_agent/agent.py
  • 281.
Debug the agent in the browser with > adk web, or via the ADK command line interface with > adk run [agent_name]
  • 282.
Call the agent in the game: the control layer (game_controller.py) reaches the agents through adk_runners.py (ADK runners & sessions).
  • 283.
Runners, Sessions & Agents – read more about the ADK Runtime: https://google.github.io/adk-docs/runtime/
  • 284.
async def initialise_validation_agent(self):
    session_service = InMemorySessionService()
    app_name = "ValidationAgent"
    self.validation_agent_session = await session_service.create_session(
        app_name=app_name,
        user_id="test_user"
    )
    self.validation_agent_runner = Runner(
        agent=validation_agent,
        session_service=session_service,
        app_name=app_name
    )
    logger.info("Validation agent initialized")
    return

https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
  • 285.
async def validate_input(self, user_input="elephant"):
    try:
        async for event in self.validation_agent_runner.run_async(
            user_id="test_user",
            session_id=self.validation_agent_session.id,
            new_message=types.Content(role='user', parts=[types.Part(text=user_input)])
        ):
            if event.is_final_response():
                logger.debug(event.content.parts[0].text)
                return json.loads(event.content.parts[0].text)
            else:
                pass
    except Exception as e:
        logger.error(f"Error during validation: {e}")
        return None

https://github.com/yoyu777/adk-game-demo-public/blob/main/adk_runners.py
  • 286.
  • 287.
The guessing agents: a custom orchestrator agent coordinates two LLM agents – each new round the guess agent makes a guess; if it is not confident, the question agent produces the next question.
  • 288.
class GuessOutput(BaseModel):
    guess: str = Field(..., description="The final guess for what the user is thinking of")
    confidence: int = Field(..., description="Confidence level (1-10) in this guess")
    reasoning: str = Field(..., description="Explanation of why this is the best guess. Summarise in less than 20 words.")

# Guessing Agent - Responsible for making final guesses
guessing_agent = Agent(
    name="guessing_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at making educated guesses in 20 Questions game",
    description="""You analyze all the information gathered from previous
    questions and answers to make the best possible guess about what the
    user is thinking of.""",
    output_schema=GuessOutput,
    output_key="guess_output",  # this is how you pass data between agents
)

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 289.
class QuestionOutput(BaseModel):
    question: str = Field(..., description="A strategic yes/no question to ask")
    reasoning: str = Field(..., description="Explanation of why you ask this question. Summarise in less than 20 words.")

# Asking Agent - Responsible for generating strategic questions
asking_agent = Agent(
    name="asking_agent",
    model="gemini-2.5-flash",
    instruction="You are an expert at asking strategic yes/no questions in 20 Questions game",
    description="""You specialize in asking the most effective yes/no
    questions to narrow down possibilities. Your goal is to eliminate as
    many possibilities as possible with each question. Consider categories
    like: …""",
    output_schema=QuestionOutput,
    output_key="question_output",
)

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 290.
  • 291.
class RootAgent(BaseAgent):  # extending the BaseAgent class
    guessing_agent: Agent
    asking_agent: Agent

    def __init__(self, name: str, guessing_agent: Agent, asking_agent: Agent):
        super().__init__(
            name=name,
            guessing_agent=guessing_agent,
            asking_agent=asking_agent,
            sub_agents=[guessing_agent, asking_agent],
        )

    # overriding the implementation, generating a series of events
    async def _run_async_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        # (custom logic)
        yield event

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 292.
# calling the guessing agent
async for event in self.guessing_agent.run_async(ctx):
    yield event

# accessing session state
guess_output = ctx.session.state.get("guess_output", None)
confidence = guess_output.get("confidence") if guess_output else None

# deterministic logic
if confidence is not None and confidence >= 9:
    logger.info("High confidence guess, proceeding to make guess")
    yield self.create_text_response_event(dumps({
        "action": "make_guess",
        "guess": guess_output.get("guess"),
        "reasoning": guess_output.get("reasoning")
    }), invocation_id=invocation_id)
    return

# calling the asking agent
async for event in self.asking_agent.run_async(ctx):
    yield event

https://github.com/yoyu777/adk-game-demo-public/blob/main/question_agents/agent.py
  • 293.
  • 294.
● ADK helps you build applications ● important concepts: Runners, Sessions, Agents ● simple agent, multi-agent & custom agent. We didn’t cover tools, workflow agents, or using custom models, which are also interesting. Here is an excellent learning resource: https://codelabs.developers.google.com/onramp/instructions and the demo repo: https://github.com/yoyu777/adk-game-demo-public
  • 295.
  • 296.
  • 297.
  • 298.
  • 299.
Prerequisites: cloud-shell; gemini-cli (available in Cloud Shell); gcloud mcp; cloud-run mcp
  • 300.
Overview – Part 1: claim the GCP credit; link it with a billing account and project; install Gemini CLI's gcloud MCP; visit the Prompt Generator. Part 2: create a VPC network; create a subnet; create a VM instance and install NGINX; check the server.
  • 301.
Links – Credit: trygcp.dev/claim/devfest-baku · Repo: github.com/alper-sari/geminicli-cloud-shell-tutorial
  • 302.
  • 304.
Catch the Bug Before It Blinks: Pro Testing for Front-End Devs – Javid Aliyev (@thinkingIT), Software Engineer
  • 305.
  • 306.
Global companies I worked at: Processica (branch of AWS), Cymulate (Israel)
  • 308.
1. Why we need testing
  • 309.
Benefits of front-end testing (@GDG): identifies bugs; ensures consistency; cross-browser/device compatibility; faster development cycle; scope for third-party integration.
  • 310.
  • 311.
  • 312.
Differences between back-end and front-end testing – Backend testing: focuses on the functionality of the server and database; ensures performant APIs; does not require a browser. Frontend testing: focuses on the interaction between the user and the software; does not require a database; may require a browser.
  • 313.
  • 314.
Tools for each test phase – Unit testing: Jest, Vitest, Mocha. Performance testing: Lighthouse (Chrome DevTools), WebPageTest, K6. End-to-end (E2E) testing: Playwright, Cypress, Selenium, WebdriverIO. Integration testing: Vitest (component + store), Jest, React Testing Library. Cross-browser testing: Playwright (Chromium, Firefox, WebKit), BrowserStack, Sauce Labs. Accessibility testing: axe-core (industry standard), jest-axe, Lighthouse Accessibility Audit, Pa11y.
  • 315.
Tools for each test phase – Visual regression testing: Playwright snapshots, Chromatic (for Storybook), Percy, Applitools Eyes, Loki. Acceptance testing: Cypress (business-flow validation), Playwright, Testim / QA Wolf (automated acceptance frameworks).
  • 316.
  • 317.
  • 318.
Questions? LinkedIn: @javid aliyev · Telegram: @alyevv · GitHub: @cavid-aliyev