
Thursday, February 13, 2025

Deepseek and Open Source Large Language Models (LLMs)

Deepseek is getting a lot of publicity these days as an open source Large Language Model (LLM), and it has me thinking not just about Deepseek but about the potential of open source LLMs in general. MIT Technology Review recently published a scary article titled "An AI chatbot told a user how to kill himself—but the company doesn't want to 'censor'" that got me thinking even more about the impact open source LLMs can have.

The MIT Technology Review Article

The article reports on a concerning incident where an AI chatbot explicitly encouraged and provided instructions for suicide to a user named Al Nowatzki. The article highlights broad concerns about AI companion apps and their potential risks to vulnerable users' mental health.

 

Anthropomorphization 

Anthropomorphization is a pretty fancy word for a simple idea: the attribution of human characteristics, behaviors, emotions, or traits to non-human entities such as animals, objects, or, in this case, artificial intelligence systems. It is something AI systems are getting better and better at.


What does this have to do with Open Source?

Deepseek, a recently released open-source large language model that specializes in coding and technical tasks, was developed as an alternative to proprietary AI models. If you are not familiar with the term "open source," it basically means the source code, model weights, or other components are freely available for anyone to view, use, modify, and distribute under specified licensing terms.

Now, since Deepseek is open source, if you have adequate computing resources, you can easily install and run Deepseek models locally on your computer. Here’s basically what you'll need:

  • Sufficient GPU memory - depending on the model size, you'll need a powerful GPU (like an NVIDIA card with 8GB+ VRAM)
  • Enough system RAM - typically 16GB+ recommended
  • Adequate storage space for the model weights
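As a rough sanity check on the hardware list above, you can estimate the GPU memory a model needs from its parameter count. This is a back-of-the-envelope sketch, not an exact figure; the 20% overhead factor is my own assumption, and real usage varies with context length and quantization scheme:

```python
# Rough VRAM estimate for holding model weights in GPU memory:
# parameters x bytes-per-parameter, padded ~20% for activations and
# the KV cache (the overhead factor is an assumption, not a spec).
def vram_estimate_gb(params_billion: float, bytes_per_param: float,
                     overhead: float = 1.2) -> float:
    return params_billion * bytes_per_param * overhead

# A 7B-parameter model:
print(round(vram_estimate_gb(7, 2.0), 1))  # fp16 weights: 16.8 GB
print(round(vram_estimate_gb(7, 0.5), 1))  # 4-bit quantized: 4.2 GB
```

This is why the 8GB+ VRAM guideline above points you toward smaller or quantized models.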

The basic process, which you can now find documented all over the web, commonly involves:

  1. Setting up Python and required dependencies
  2. Installing the necessary ML frameworks (like PyTorch)
  3. Downloading the model weights
  4. Using libraries like transformers or llama.cpp to run the model

It may sound complicated, but it is really pretty simple to set up if you follow the instructions.
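Step 4 above can be sketched in a few lines using the Hugging Face transformers library. The default model name here is an assumption (substitute whichever Deepseek checkpoint you actually downloaded), and the imports are deferred into the function because running it requires `pip install transformers torch` plus several gigabytes of weights:

```python
def run_local(prompt: str,
              model_name: str = "deepseek-ai/deepseek-coder-1.3b-instruct") -> str:
    """Load a local checkpoint and generate a completion for `prompt`.

    The default model name is an assumption; point it at whichever
    checkpoint you downloaded. Imports are deferred so this sketch can
    be read (and its shape checked) without transformers/torch installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example (downloads the weights on first run):
# print(run_local("Write a Python function that reverses a string."))
```

Tools like llama.cpp follow the same pattern with quantized weights, which is what makes the consumer-GPU route practical.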

What’s the big deal?

AI training is the process of feeding large amounts of data into machine learning algorithms to help them recognize patterns and learn to perform specific tasks, like generating text or recognizing images, by adjusting their internal parameters through repeated exposure and feedback. So what is to prevent a malicious person with an open source AI installed from taking this a few steps further, training an AI to do all kinds of malicious things and providing access via the web?
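That definition of training can be made concrete with a toy example: a one-parameter "model" that learns the pattern y = 3x purely through repeated exposure and feedback. The data, learning rate, and target are of course invented for illustration; real LLM training does this with billions of parameters:

```python
# One-parameter gradient descent: "training" reduced to its essence.
data = [(1, 3), (2, 6), (3, 9)]  # examples of the hidden pattern y = 3x
w = 0.0                          # the model's single internal parameter
lr = 0.02                        # learning rate: how big each adjustment is

for epoch in range(200):         # repeated exposure to the data
    for x, y in data:
        error = w * x - y        # feedback: how wrong was the prediction?
        w -= lr * error * x      # adjust the parameter to shrink the error

print(round(w, 2))  # -> 3.0, the pattern recovered from the data
```

Swap in harmful example data and the same mechanism learns harmful behavior, which is exactly the worry with unrestricted local models.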


If you or someone you know is struggling with suicidal thoughts, call or text 988 to reach the Suicide and Crisis Lifeline.

Friday, June 21, 2024

An Exponential Leap: The Emergence of AGI - Machines That Can Think

Tech companies are in a rush. They're trying to lock in as much electricity as they can for the next few years. They're also buying up all the computer components they can find. What's all this for? They're building machines that can think and referring to the tech as Artificial General Intelligence, or AGI.

On June 3, ex-OpenAI researcher Leopold Aschenbrenner (yes, he was fired) published an interesting 162-page document titled SITUATIONAL AWARENESS: The Decade Ahead. In the paper, Aschenbrenner describes AGI as not just another incremental tech advance; he views it as a paradigm shift that is rapidly approaching an inflection point.


I’ve read the whole thing - here's my short list of highlights by topic.


Compute Infrastructure Scaling: We've moved beyond petaflop systems. The dialogue has shifted from $10 billion compute clusters to $100 billion, and now to trillion-dollar infrastructures. This exponential growth in computational power is not just impressive—it's necessary for the next phase of AI development.


AGI Timeline Acceleration: Current projections suggest AGI capabilities surpassing human-level cognition in specific domains by 2025-2026. By the decade's end, we're looking at potential superintelligence—systems that outperform humans across all cognitive tasks.


Resource Allocation and Energy Demands: There's an unprecedented scramble for resources. Companies are securing long-term power contracts and procuring voltage transformers at an alarming rate. We're anticipating a surge in American electricity production by tens of percentage points to meet the demand of hundreds of millions of GPUs.


Geopolitical Implications: The race for AGI supremacy has clear national security implications. We're potentially looking at a technological cold war, primarily between the US and China, with AGI as the new nuclear equivalent.


Algorithmic Advancements: While the mainstream still grapples with language models "predicting the next token," the reality is far more complex. We're seeing advancements in multi-modal models, reinforcement learning, and neural architecture search that are pushing us closer to AGI.


Situational Awareness Gap: There's a critical disparity between public perception and the reality known to those at the forefront of AGI development. This information asymmetry could lead to significant societal and economic disruptions if not addressed.


Some Technical Challenges Ahead:

- Scaling laws for compute, data, and model size

- Achieving robust multi-task learning and zero-shot generalization

- Solving the alignment problem to ensure AGI systems remain beneficial

- Developing safe exploration methods for AGI systems

- Creating scalable oversight mechanisms for increasingly capable AI

An overreaction by Aschenbrenner? Some think so. Regardless, this stuff is not going away, and as an educator and technologist, I feel a responsibility not only to teach the tech but also to have students consider the ethical and societal implications of this kind of work. The future isn't just coming—it's accelerating toward us at an unprecedented rate. Are we prepared for the technical, ethical, and societal challenges of AI that lie ahead?

Monday, April 29, 2024

Distributed Inference And Tesla With Some SETI Nostalgia


In this post, I’m setting aside any political stuff and focusing solely on tech.

 

In recent months, the electric vehicle (EV) market has seen a decline, marked by falling sales and an increase in unsold inventory. Tesla, in particular, has received a significant share of negative attention. During Tesla's first-quarter earnings call last week, Elon Musk diverged from the norm by highlighting Tesla's broader identity beyond its role in the automotive industry. He emphasized the company's engagement in artificial intelligence and robotics, suggesting that pigeonholing Tesla solely within the EV sector overlooks its broader potential.
 

Musk's suggestion to actively utilize Tesla's computational power hints at a larger strategic vision. He envisions a future where idle Tesla vehicles contribute to a distributed network for AI model processing, termed distributed inference. This concept could leverage the collective computational strength of millions of Tesla cars worldwide, extending the company's impact beyond transportation.

 

Very interesting: I drive maybe 1-2 hours per day, and the rest of the time my car sits unused. What if all that computing horsepower could be put to work while I'm not driving? Musk's concept brings back memories of the now-sunsetted SETI@home application. SETI@home was a distributed computing project that allowed volunteers to contribute their idle computer processing power to analyze radio signals from space in the search for extraterrestrial intelligence (SETI). It used data collected by the Arecibo Observatory in Puerto Rico and the Green Bank Telescope in West Virginia to search for patterns or anomalies that could indicate the presence of intelligent alien civilizations.

 

Participants in SETI@home downloaded a screensaver or software client onto their computers, which would then process small segments of radio telescope data during periods of inactivity. The processed data would be sent back to the project's servers for analysis. By harnessing the collective power of millions of volunteer computers around the world, SETI@home was able to perform computations on an unprecedented scale. The project was launched in 1999 by the University of California, Berkeley, and it quickly became one of the largest distributed computing projects in history. Although the original SETI@home project ended in 2020, its legacy lives on as an example of the power of distributed computing and the widespread public interest in the search for extraterrestrial life.
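The SETI@home pattern described above (fetch a small work unit, crunch it during idle time, report the result back) can be sketched in a few lines. Everything here is a stand-in: a real client spoke to the project servers over the network, and the analysis was sophisticated signal processing rather than a simple peak check:

```python
from queue import Queue

def analyze(chunk):
    """Stand-in for signal analysis: flag chunks with an unusually strong peak."""
    return max(chunk) > 0.9

# The "server" hands out small slices of telescope data as work units.
server_queue = Queue()
for work_unit in ([0.1, 0.2, 0.3], [0.2, 0.95, 0.1], [0.4, 0.5, 0.6]):
    server_queue.put(work_unit)

# Each volunteer client repeatedly fetches a unit, processes it locally,
# and sends the result back for aggregation.
results = []
while not server_queue.empty():
    unit = server_queue.get()
    results.append(analyze(unit))

print(results)  # [False, True, False]
```

A Tesla-style distributed inference network would follow the same shape, with model inference requests as the work units and idle cars as the volunteer clients.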

 

Musk's vision underscores Tesla's potential to revolutionize not only the automotive sector but also broader domains such as artificial intelligence and robotics. It signifies a strategic shift towards leveraging Tesla's resources and expertise in a SETI-like way to drive innovation and create value in new and unexpected ways.