Building AI Systems
That Users (and
Companies) Love
Mochamad Rafy Ardhanie | Ex-Curriculum Developer, Dicoding
Mochamad Rafy Ardhanie
Education:
● Indonesia Computer University
Work Experiences:
● Engineer, Dicoding
● Curriculum Developer, Dicoding
AI Revolution (at least right now)
The world won't wait for us to take action
Issue at Dicoding
Student Needs
Company Needs
What is Love?
What do users "love"?
Users crave a seamless experience.
They expect instant responses (low
latency), accurate, relevant, and reliable
answers (no hallucinations), and
interactions that feel personalized and
genuinely helpful.
What do companies "love"?
Companies operate based on metrics.
They demand operational efficiency, clear
and sustainable return on investment
(ROI), predictable and manageable costs,
sustainable competitive advantage, and,
most importantly, mitigation of legal and
reputational risks.
Pragmatic Solution
A pragmatic solution is entirely
focused on solving a specific,
existing real-world problem
efficiently and effectively.
Exploratory-Driven
Prioritizes the exploration of novel
technologies and cutting-edge
capabilities to create new
possibilities, often before a
specific market need is defined.
What Can We Do?
Agnostic
Solution-focused rather than
tool-loyal, refusing to be tied to
a single framework, cloud
vendor, or specific model
architecture.
Not all of our problems
“must be solved” with
Generative AI.
AI must provide clear Benefit, be justified against
its Cost, and present manageable Risk.
Dicoding AI Approach
Proprietary
Using paid,
high-performance "black
box" models via an API.
Open-Source
Using free, adaptable
models that we can
customize and host
ourselves.
Hybrid
Strategically mixing both
proprietary and open-source
models to balance cost and
capability.
Implication
Operational
We achieved a 74.98%
improvement in
operational efficiency.
Average Man-Hours
We freed up 80.9% of our
team's time for more
critical tasks.
Company X
Average productivity increase of 14%
(up to 34% for novice workers).
— Generative AI at Work (Working Paper 31161)
Company Y
Saved 12,000 hours of work in 18
months.
— YYY Case Study: Transforming HR with AI (AskHR)
How was
the trip? :D
01
As companies move from tinkering to
deploying models in production, we face
three-plus-one main concerns:
1. Cost: Significant for compute-intensive
AI applications.
2. Quality & Performance: Critical for AI
applications.
3. Security: Important for data residency
and preventing third-party models
from ingesting private data.
4. Tech Updates
Of course, we are facing
several problems
1. Operational Cost — API
So, one of the services that uses GPT-4.1
costs around $4.54
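The arithmetic behind a figure like the $4.54 above is worth making explicit: API spend is just token volume times per-token price. A minimal sketch, using illustrative per-million-token prices (not actual GPT-4.1 rates; check your provider's current pricing):

```python
# Back-of-the-envelope API cost estimator. Prices are illustrative
# placeholders, not real GPT-4.1 rates.

def estimate_cost(requests, in_tokens, out_tokens,
                  price_in_per_m=2.00, price_out_per_m=8.00):
    """Estimate total API cost in USD for a batch of requests."""
    cost_in = requests * in_tokens / 1_000_000 * price_in_per_m
    cost_out = requests * out_tokens / 1_000_000 * price_out_per_m
    return round(cost_in + cost_out, 2)

# e.g. 500 requests/day, ~1,200 prompt tokens and ~400 output tokens each
daily = estimate_cost(requests=500, in_tokens=1_200, out_tokens=400)
print(f"Estimated daily spend: ${daily}")
```

Note that output tokens usually cost several times more than input tokens, so verbose responses dominate the bill quickly.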
1. Operational Cost — API
ScaleDown - Substack
1. Operational Cost — Self-host
analytics_vidhya
2. Quality & Performance
Open-source vs. Proprietary Models - by Chris Zeoli
Quality benchmarks: Measure how well a model answers
questions, reasons, or follows instructions.
Performance benchmarks: Measure how fast and efficiently a
model runs in real-world environments.
Resource
3. Security
SaaS (Proprietary): Security primarily centers on data
privacy and vendor trust, as sensitive company data is
transmitted to and stored by a third party, raising
concerns about potential breaches, compliance with
data regulations, and how the vendor uses your
inputs.
Self-Hosted: Security responsibility lies entirely with
your internal infrastructure and code integrity,
demanding robust protection for servers, networks, and
APIs, along with vigilance against vulnerabilities in
open-source components and the theft of your
customized models.
4. The Tech is Still Rapidly Evolving
Everything is
Easy*
— once it's done.
– But when will it be finished? :p
02
1. Sliding Tackle — Cost
SaaS vs On-Premise: Making Informed Software Decisions
The on-prem (self-hosted) approach
provides maximum control and data
security by keeping models in-house, but it
is expensive and difficult to scale for
"bursty" traffic.
Conversely, the SaaS (proprietary)
solution offers effortless scalability and
access to state-of-the-art models but
requires sacrificing data control and
trusting a third-party vendor.
1. Sliding Tackle — Cost
1. Keep the Open Source Models as long as
they solve your baseline business problems.
2. Escalate to Proprietary APIs for
State-of-the-Art (SOTA) capabilities when
your OS models hit their performance or
reasoning ceiling.
3. Use the Hybrid Approach to get the best of
both worlds: use self-hosted for
high-volume/low-cost tasks and tap into APIs
for high-complexity/low-volume tasks,
perfectly balancing cost and capability.
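The three-step escalation policy above can be sketched as a one-function router: stay on the self-hosted model by default, escalate only past its ceiling. Model names and the complexity score are placeholders for your own stack:

```python
# Hedged sketch of the escalation policy: open-source first,
# proprietary API only past the local model's ceiling.
LOCAL_MODEL = "llama-3.1-8b"      # placeholder self-hosted baseline
SOTA_MODEL = "proprietary-sota"   # placeholder paid API model

def pick_model(task_complexity: float, complexity_ceiling: float = 0.7) -> str:
    """Return the cheapest model expected to handle the task.

    task_complexity is a score in [0, 1] from your own classifier or
    heuristic -- estimating it well is the hard part in practice.
    """
    if task_complexity <= complexity_ceiling:
        return LOCAL_MODEL          # high-volume / low-cost path
    return SOTA_MODEL               # low-volume / high-complexity path

print(pick_model(0.3))  # baseline task stays on the self-hosted model
print(pick_model(0.9))  # hard task escalates to the API
```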
Amazon Science - How task decomposition and smaller LLMs can
make AI more affordable
2. Sliding Tackle — Performance Issues
Pick the simplest tool that meets today’s needs, with headroom for
tomorrow. Start on a workstation (Ollama/LM Studio), move to a GPU
server (vLLM/SGLang), and standardize with Triton when you’re ready
to run many models.
1. Don't worry about SaaS as long as you
have internet access, money, and the service
isn't down.
2. Self-Host: Consider using smaller models.
3. The hybrid approach bridges this gap by
using secure on-prem systems for sensitive,
baseline workloads while "overflowing" to
the cloud to manage peak demand,
strategically balancing cost, control, and
elasticity.
2. Sliding Tackle — Performance Issues
Amazon Science - How task decomposition and smaller LLMs can
make AI more affordable
LLM Locust: A Tool for Benchmarking LLM Performance or you can
use genaiperf, etc
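Whatever benchmarking tool you pick, the core loop is the same: time each call and report latency percentiles. A toy sketch with a stand-in model call (the sleep is a placeholder, not real inference):

```python
# Toy latency benchmark in the spirit of LLM Locust / genai-perf:
# time each call, report p50 and p95.
import statistics
import time

def call_model(prompt: str) -> str:
    time.sleep(0.01)            # stand-in for real inference latency
    return prompt.upper()

def benchmark(n: int = 20) -> dict:
    latencies = []
    for i in range(n):
        start = time.perf_counter()
        call_model(f"request {i}")
        latencies.append(time.perf_counter() - start)
    return {
        "p50_ms": statistics.median(latencies) * 1000,
        # quantiles(n=20) yields 19 cut points; index 18 is the 95th
        "p95_ms": statistics.quantiles(latencies, n=20)[18] * 1000,
    }

stats = benchmark()
print(f"p50={stats['p50_ms']:.1f}ms  p95={stats['p95_ms']:.1f}ms")
```

Real tools add concurrency, token-throughput, and time-to-first-token metrics on top of this skeleton.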
2. Sliding Tackle — Performance Issues
Sometimes, smaller is better, at least
for performance.
2. Sliding Tackle — Quality Issues
A language
model is simply a computational
system that can predict the next word
from previous words.
— Speech and Language Processing 3rd Edition, Large Language
Models, Dan Jurafsky and James H. Martin.
Accuracy does not scale linearly with size; SLMs often match LLMs on
structured or narrow tasks, while LLMs consistently outperform on
complex reasoning.
● Quantization
● Pruning
● Distillation
● LoRA
● Building an SLM from Scratch
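Of these techniques, LoRA's saving is easy to quantify: fully fine-tuning a d × k weight trains d·k parameters, while LoRA trains two low-rank factors of shapes (d, r) and (r, k), i.e. r·(d + k). A sketch with illustrative shapes (not taken from any specific model):

```python
# Why LoRA is cheap: count trainable parameters for one weight matrix.
# Shapes below are illustrative placeholders.

def lora_params(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full fine-tune params, LoRA params) for one d x k weight."""
    full = d * k          # update the whole matrix
    lora = r * (d + k)    # train only the two rank-r factors
    return full, lora

full, lora = lora_params(d=4096, k=4096, r=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full / lora:.0f}x")
```

At rank 8 on a 4096 × 4096 weight, that is a 256× reduction in trainable parameters for that layer, which is why LoRA fits on modest GPUs.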
2. Sliding Tackle — Quality Issues
A Guide to Context Engineering for PMs
Large pre-trained language models
have been shown to store factual
knowledge in their parameters...
However, their ability to access and
precisely manipulate this knowledge is
limited, and hence they lag behind
task-specific architectures.
— (Lewis et al., 2021)
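The standard answer to this limitation is retrieval: fetch relevant text and put it in the context window so the model answers from evidence rather than parametric memory. A deliberately tiny sketch, where keyword overlap stands in for the embedding search a real system would use:

```python
# Minimal retrieval sketch: pick the most relevant document and
# prepend it to the prompt. Real systems use embeddings and a vector
# store; word overlap here is only for illustration.

DOCS = [
    "Dicoding offers online programming courses in Indonesian.",
    "LoRA fine-tunes models by training low-rank adapter matrices.",
]

def retrieve(query: str, docs: list[str]) -> str:
    """Return the doc sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str) -> str:
    context = retrieve(query, DOCS)
    return f"Context: {context}\nQuestion: {query}\nAnswer from the context."

print(build_prompt("What is LoRA fine-tuning?"))
```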
2. Sliding Tackle — Quality Issues
Faithfulness: Does the model's answer truly come
from the given context (to prevent hallucinations)?
Answer Relevance: Does the model's answer truly
answer the user's question?
Coherence: Are the sentences coherent and logical?
Safety/Toxicity: Is there any harmful, biased, or
policy-violating output?
Human Evaluator
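Before reaching for a human evaluator, a crude automated screen for the Faithfulness criterion can catch the obvious cases. A toy heuristic, not a production metric (an LLM-as-judge or a framework such as Ragas would be typical in practice):

```python
# Crude faithfulness screen: flag answer words that never appear in
# the retrieved context. Toy heuristic for illustration only.

STOPWORDS = {"the", "a", "an", "is", "are", "of", "in", "to", "and"}

def unsupported_words(answer: str, context: str) -> set[str]:
    """Return content words in the answer that the context never mentions."""
    ctx = set(context.lower().split())
    ans = set(answer.lower().split()) - STOPWORDS
    return ans - ctx

context = "the course takes three weeks and includes two projects"
good = unsupported_words("the course takes three weeks", context)
bad = unsupported_words("the course takes five months", context)
print(good)  # set() -- every content word is grounded in the context
print(bad)   # {'five', 'months'} -- likely hallucinated
```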
2. Sliding Tackle — Quality Issues
3. Sliding Tackle — AI Evolution
"Run, don't walk. Either you are running for
food, or you are running from being food."
— Jensen Huang, May 26, 2023
Are We Done?
03
"Deploying a system is not the end. It’s the
beginning. Once a system is deployed, it
interacts with the real world... and the real
world changes."
— Chip Huyen, Designing Machine Learning Systems: An Iterative
Process for Production-Ready Applications
We Never Finish the Projects — Not yet
The "Tax" of Independence (Reality Check)
Hardware Requirements: GPU availability and
VRAM management for latest models
Engineering Overhead: MLOps and
collaboration skills :D
Responsibility: If the server goes down, you are
the support team
A Survival Guide for Myself as a Developer
Abstraction Layers: Never hardcode a model. Use
agnostic interfaces (e.g., AI SDK, LiteLLM) to swap
backends instantly.
Evaluation Driven Development (EDD): Trust your test
suite, not the hype. Run 'evals' to verify if a new model
actually improves your specific use case.
Dynamic Routing: Don't use a cannon to kill a mosquito.
Route simple tasks to fast/local models and complex
logic to SOTA models.
"The goal isn't to pick the best model forever, but to build a system that can adapt to the
best model of the month."
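The first and third points above combine into one small pattern: an agnostic interface with swappable backends, plus a route table per task type. A hand-rolled sketch with stub backends; in practice a library such as LiteLLM or the AI SDK plays the interface role, and the names here are placeholders:

```python
# Agnostic model interface + dynamic routing, hand-rolled for
# illustration. Backend names and tasks are placeholders.
from typing import Callable, Dict

Backend = Callable[[str], str]

class ModelRouter:
    """Swap backends without touching call sites; route by task tag."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self._routes: Dict[str, str] = {}

    def register(self, name: str, backend: Backend, tasks: list) -> None:
        self._backends[name] = backend
        for task in tasks:
            self._routes[task] = name

    def complete(self, task: str, prompt: str) -> str:
        name = self._routes[task]            # pick the model for this task type
        return self._backends[name](prompt)

# Stub backends stand in for a local SLM and a SOTA API.
router = ModelRouter()
router.register("local-slm", lambda p: f"[local] {p}", tasks=["classify"])
router.register("sota-api", lambda p: f"[sota] {p}", tasks=["reasoning"])

print(router.complete("classify", "tag this ticket"))    # cheap local path
print(router.complete("reasoning", "plan a migration"))  # SOTA path
```

Swapping the model of the month then means changing one `register` call, with your eval suite deciding whether the swap actually helped.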
Thank You
rafyardhani
rafy rafyardhanie
Get in touch
rafy@dicoding.com

[BDD 2025 - Artificial Intelligence] Building AI Systems That Users (and Companies) Love. (Mochamad Rafy Ardhanie)
