Alpine Linux Docker images have NULL for root password (mitre.org)
359 points by alpb on May 8, 2019 | 192 comments



Is this a risk out of the box?

    $ docker run -it -u guest alpine 
    / $ su
    su: must be suid to work properly
    / $ login
    login: must be suid to work properly
    / $ find / -perm /4000 -print
    find: /root: Permission denied
    find: /proc/tty/driver: Permission denied
    / $


No, it is not an issue: there are no suid binaries at all in the base install, so there is no way of changing user.

The link from the CVE says "Due to the nature of this issue, systems deployed using affected versions of the Alpine Linux container that utilize Linux PAM, or some other mechanism that uses the system shadow file as an authentication database, may accept a NULL password for the root user." - so the config would have to install PAM with suid binaries, and configure shadow passwords and not change the password. This is pretty unlikely overall.


Docker is a risk "out of the box". It requires root to run the daemon, and to use it you either need sudo or need to be in the docker group, which is equivalent to root access.

On their website they claim docker is "quite secure" if you "run your processes as non-privileged users inside the container", but I do not see how that can be the case given that breakouts happen all the time, and again, it requires root to run, which gives me the opposite of a secure vibe.

Docker is for convenience and it is a security risk.


Docker also has support for user namespaces, which makes it so that root inside the container is e.g. nobody outside. In general I'd say Docker (with user namespaces) is quite a bit more secure than running as a normal user without containers when it comes to the app being breached and the attacker escaping the container. However, this is very distinct from the fact that it basically ignores all multi-user aspects of Unix when it comes to using the docker command. So while the application inside is more secure, using docker is insecure. In the typical deployment scenario the latter doesn't really matter, since an admin deploys and the developer only creates the image locally.


If you are running as a non-privileged user inside a container, which breakouts do you think can be used to escalate privileges to the hosting machine?

A container is just a Linux process with added security constraints. So if you're talking about escalation from an unprivileged process inside a container, you need a standard Linux privesc attack (nothing to do with Docker) and you also need to bypass the isolation mechanisms on that process (namespaces, capabilities, seccomp filter, SELinux/AppArmor).
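You can see a couple of those constraints from inside a container (a quick sketch; exact values depend on your Docker version and defaults):

  docker run --rm alpine sh -c 'grep -E "CapEff|Seccomp" /proc/self/status'
  # CapEff is the (reduced) effective capability set; Seccomp: 2 means a filter is active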


It's just about possible for someone to accidentally do something like this, on "host":

  # docker run -it -p 2222:22/tcp alpine
  / # apk add openssh-server
  / # for t in ecdsa ed25519 rsa ; do ssh-keygen -t $t -C root -P '' -f /etc/ssh/ssh_host_${t}_key ; done
  / # echo PermitRootLogin yes >> /etc/ssh/sshd_config
  / # echo PermitEmptyPasswords yes >> /etc/ssh/sshd_config
  / # /usr/sbin/sshd
Then, on another machine:

  ssh -p 2222 root@host


According to the CVSS score it's a huge risk. FTA:

> 9.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

(Leaving my opinion out of this.)


On every distro in the past, I'd do sudo passwd. Always worked. No idea about Alpine.


But sudo access requires you to have logged in through a user in the wheel group. That at least is not a vulnerability.


OK. In my experience it worked with any user account. I would install, create an account, and immediately use sudo to change the root password.


The first user account is considered an administrator account on most distros by default, so it has sudo privileges.

If you can do what you said with a non-wheel/sudo account, that would be a serious vulnerability.


I see, thanks. I posted hoping people would inform me about this.


Alternatively, it's possible the distro added the first user account to wheel also. I believe I've seen that in the past.


How do you think that's different than the comment to which you replied?


Alpine comes with no suid binaries in Docker to my knowledge (it's expected you run your stuff as root inside the container unless there is a reason not to)


Just for the uninitiated:

suid binaries are binaries with a special flag set that makes them run with the privileges of the file's owner (typically root) regardless of who started them.

sudo is an example of something that would use suid. When a user runs sudo, the binary actually runs with root privileges from the get-go, checks if the user is OK, then executes the command you specified.

However, use of sudo or other suid binaries is entirely pointless in an alpine container. There being no password also does not matter, as you are by default already running everything as root. Who cares if root can become root?
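A quick way to see that for yourself (a sketch; the Debian line is only there for contrast):

  docker run --rm alpine find / -xdev -perm -4000 2>/dev/null   # prints nothing
  docker run --rm debian find / -xdev -perm -4000 2>/dev/null   # passwd, su, mount, ...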


> There being no password also does not matter, as you are by default already running everything as root. Who cares if root can become root?

Best practice would have you switch to a non root user before running whatever it is inside the container. Although if you haven’t added any suid binaries by accident then there’s no way to go back.

E.g. the node alpine image adds a “node:node” user and group for the process to run as instead of root. https://github.com/nodejs/docker-node/blob/master/10/alpine/...
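The same pattern in your own image looks roughly like this (a sketch; "app" is a placeholder, not the node image's exact setup):

  FROM alpine:3.9
  RUN addgroup -S app && adduser -S -G app app
  USER app
  CMD ["id"]

Building and running that should report a non-zero uid for the process instead of root.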


Well in theory someone could escalate their privileges to the exact same ones they already have! THAT'S TERRIBLE!


OK, so what’s the entire point of this article then?


I have always been a bit surprised at the popularity of Alpine Linux for docker images. It’s awesome that the images are pretty small, but a wide variety of software has been shown to run noticeably slower on Alpine compared to other distributions, in part due to its usage of musl instead of glibc. I’d think that a few megabytes of disk isn’t as valuable as the extra cpu cycles.


For anyone who wants a small image but with glibc, https://github.com/GoogleContainerTools/distroless is a good choice, especially if you are writing in a statically linked language, e.g. Go or Rust.


"distroless" is just Debian packages. Their self-description is fairly annoyingly misleading, since they don't mention that they are just using packages from Debian.


Why is that a bad thing? Binaries are binaries: whether you copied them from a deb package or built them entirely from source (assuming reproducible builds, which Debian supports), they are the same.


It's absolutely not a bad thing, indeed I think it's a good thing to use binaries from Debian. I just think the name of the project is strongly misleading. "distroless" is built from the Debian distribution, not something new made from scratch.


Not sure how widespread that view is, but that's never the association I would have made.

"Xless" is just a different way of saying "without X". "fearless" = "without fear"; "serverless" = "without a server" (yes, I know, not really); so "distroless" = "(comes) without a distribution"

So in my opinion the name correctly suggests that it's "just the package" and comes "without a distro", and not "was built without the help of a distribution".


> Binaries are binaries, whether you copied from a deb package or completely built from source code

No, they are not.

Debian packages are deployable packages, built around the packages, services, and conventions adopted and provided by the target distribution.


it should probably be called "package-manager-less" because there's no package manager in the final build, but there's also no ls, etc, so distroless kinda makes sense. Maybe systemless?


For statically linked binaries, why wouldn't you use the SCRATCH (0 kb) 'image'?


You would. Except most statically linked binaries may still need ca-certificates, tzdata, and some other files that libraries expect to be present on the system.

Not to mention, you still need runtimes if you are programming in Java, Python or many other languages. https://github.com/GoogleContainerTools/distroless project gives a way to have these runtimes + their lib dependencies while still maintaining a minimal attack surface.


For almost any serious job running in production, you might need CA certificates and openssl.


Unless you're behind a load balancer which terminates TLS and the traffic you deal with is purely http.


Please don't do this anymore. End-to-end encryption is extremely easy to set up and maintain. P2PE will absolutely lull you into a false sense of security.


Everyone seems to think E2E encryption is needed everywhere (I know because the security guys at work think it is needed everywhere, even for everything inside a VPC), but even AWS here is advertising the fact that you don't need to do this:

https://aws.amazon.com/blogs/aws/new-tls-termination-for-net...

>Today we are simplifying the process of building secure web applications by giving you the ability to make use of TLS (Transport Layer Security) connections that terminate at a Network Load Balancer (you can think of TLS as providing the “S” in HTTPS). This will free your backend servers from the compute-intensive work of encrypting and decrypting all of your traffic, while also giving you a host of other features and benefits:


If you trust Amazon to know your risk profile better than your security people, you have a management problem of some sort.


I trust myself, who set up our infrastructure, vs. the security guys whose automatic response to everything is deny all everywhere, encrypt everything everywhere (at-rest encryption isn't enough; what can you do to get the db to work on the data encrypted internally 100% of the time?), and enable 2 factor on everything (the GitHub GUI has 2FA enabled, why aren't push/pull requests using 2FA?).

I think my main point is that kneejerk reactions made to satisfy a security checklist checkbox are just as useless as assuming a default "encrypt everything everywhere" stance must be better.


Sounds like you need better security people.


...which, sadly, is true of many larger organizations: "security" ends up as just another middle management approval committee whose only job is to apply byzantine security checklists dreamed up by some Certified Security Architect (tm) way too late in the development process, right when it's hardest for product teams to reshuffle their entire architecture to comply, and with no consideration to the actual circumstances / risk profile of specific projects.

IMHO this should be viewed as a big, glaring anti-pattern, as it fundamentally puts security team goals at odds with product team goals.


Agreed.


...Which sounds like a management problem.


Agreed.


But without TLS Amazon can "decrypt" your traffic and see what's inside. It's one thing to have a backdoor inside a server that they rent to you, which would have to be actively exploited, and another to passively clone the traffic and analyze it in the name of making the service better.


If you believe that Amazon is potentially an adversary, but you still want to host it on their servers, there is essentially nothing you can do to stop them getting at your data. At some point, to process the data, you have to do that unencrypted. That is an unpluggable Achilles heel.


Why do they have to actively exploit hardware/vms that they own? Isn't it pretty trivial for a hypervisor to "passively clone" data right out of the memory of the VM? Or to use management interfaces/custom peripherals to exfiltrate data if it were bare metal? AWS is kind of a black box to me but it seems hopeless to try to protect data from people that physically control the systems.


I was especially talking about passively mirroring/analysing network traffic. AFAIK there is no easy and trivial way to "passively clone" aka dump the memory of the hypervisor all the time without it being detectable in slowdowns and so on. My concern was not that I need to protect myself from Amazon for fear that they will hack my server, but about the way that they can get insight into my customers, maybe get a snippet of the data I receive, and so on.

We once saw this from some other company: they noticed we were talking to the competitors and wanted to talk.


Last time I checked, mTLS incurred significant performance penalties and required significant soak testing to ensure that performance would be acceptable for a given application. If you're a small company, you have much lower hanging fruit to chase.


In my understanding there's additional overhead at handshake, but after that the performance is basically identical. The client certificate mostly acts to identify the client to the server, but otherwise the business of picking session keys etc is the same. At this point TLS overhead is close to free.

I think the start of this thread was a plea not to terminate HTTPS at the edge, but instead to plumb it all the way to the serving container. That's unlikely to be mTLS in any case.


What's an extremely easy solution to set up and maintain automated certificate signing and provisioning?



It does not matter how easy it is, what is the security threat you are mitigating with E2E encryption? In a large scale system it is often not trivial to build a proper E2E, far from being impossible though.


Eh, don't teach my about the systems I run. I'd love to run TLS end to end but in this one? Nah, not worth it.


"Don't teach me about ..." -- aren't we all here to learn? Let's keep the tone civil and assume the best.


I work at $CORP. I don't trust my enterprise IT department with unencrypted traffic for fear of falling victim to stupid traffic shaping or deep packet inspection intrusion prevention going haywire.


Good for you. In my current gig the trade-off is different. I don't work for Google.


> and the traffic you deal with is purely http.

Which is a truly rare case, as many backend APIs these days are required to be secured by HTTPS (or LDAPS, SMTPS, IMAPS, to name a few other OpenSSL-based secure protocols).


Sure, as long as you don't connect to ANY outside URLs for anything.


Which you probably shouldn't be doing for most of your services anyway.


I don't remember the last project I did which didn't have some sort of integration with an external service. I guess if you use microservices, most components won't need it; I mostly use monoliths.


I guess if everything is in a single service you're bound to have some sort of outbound connection at some point. Although, in my experience, you can go very far within your vpc.



I wonder if this would be important with service mesh and mutual tls...


Service meshes make in-cluster mTLS a more-or-less automatic feature, which is worth having. Some will terminate TLS at ingress and convert to mTLS internally. The argument above is that you shouldn't do this and should instead plumb that ingressing TLS traffic all the way to the container.

The downside of plumbing directly to the container is that you lose many of the routing features of a service mesh. If it can't inspect the traffic, it can't do layer 7 routing. It can only route and shape at layer 4.


This has nothing to do with being an LB; if you need to make outgoing calls with HTTPS you most likely need ca-certificates.


With static linking your binary would contain openssl.


Not the certificates though, unless you do some special tricks. I think GP is probably thinking of kerberos/NSS, which has a plugin system that requires dynamic linking.


There are two basic flavors of distroless image, one is base, the other one is static.

The `static` image is very bare minimal and only contains things like nsswitch.conf and ca certificates. This is the recommended base image for statically linked languages.

There's also a `base` image that I usually use as the base image for C programs, e.g. stunnel/unbound DNS. In those cases, I usually use Debian Stretch as the build environment (distroless uses binaries from Debian stable, so the ABI is compatible), build a dynamically linked binary, then copy the resulting binary along with all dynamic object files to a distroless base image.

So when I talk about openssl in the image, I was referring to the `base` flavor. If you are happy with the provided openssl version, the openssl in the base image is indeed useful.


You can easily use multi-stage Dockerfiles to copy the certs.
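A rough sketch of that pattern for a statically linked binary ("myapp" and the exact paths are illustrative, and may vary by distro/version):

  FROM alpine:3.9 AS build
  RUN apk add --no-cache ca-certificates tzdata
  # ... build or COPY your statically linked binary here ...

  FROM scratch
  COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
  COPY --from=build /usr/share/zoneinfo /usr/share/zoneinfo
  COPY myapp /myapp
  ENTRYPOINT ["/myapp"]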


You might want at least a shell in the container for debugging?


Distroless has debug images for this purpose: https://github.com/GoogleContainerTools/distroless/blob/mast...


Adding a shell seems antithetical to deploying production code as a static-linked binary, not to mention an expansion of the attack surface of the container.


Without a shell, how does one debug if anything goes wrong?


You can start a container with a shell that shares the PID and network namespaces of the container you want to debug.
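With plain Docker that looks roughly like this ("myapp" is a placeholder for the target container):

  docker run --rm -it --pid=container:myapp --network=container:myapp alpine sh
  # you can now see myapp's processes with ps and reach its ports on localhost,
  # without adding a shell to the production image itself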


Reading logs/traces on your log aggregation service and reproducing in a dev system?


How do you debug in the dev env without a shell?


from the host system, containers don't exist in a vacuum


With remote debugging?


remote debugging is a shell


Not necessarily. E.g. Java runtimes can expose debugging ports when needed that operate on a custom protocol.

Or you can just build gdb into the container and run the process under gdb, then attach to the tty.

Or you can debug from the host system, where the container's pid namespace is a descendant of the root namespace and the other namespaces can be accessed via /proc or unshare.


What I meant is having a remote debugger is as good as having a remote shell in terms of remote code execution.


Debugging is about when the difference between theory and practice breaks down.


You can use nsenter
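Something like this, run on the host ("myapp" is a placeholder):

  PID=$(docker inspect -f '{{.State.Pid}}' myapp)
  sudo nsenter -t "$PID" -m -u -n -p sh
  # drops you into the container's mount/uts/net/pid namespaces,
  # using whatever shell exists in the container's filesystem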


Depends. Normally I would, with golang apps. But if I need to debug an issue I'll redeploy them with Ubuntu underneath so I can use the debugging tools.


If your binary is static, why do you need a container at all?


So you can justify using k8s and shine up your resume!


It’s the new 3x SQLServer + 2x IIS + 2x SharePoint cluster to serve an intranet for 100 staff when a single <insert oss wiki on linux> would be more than sufficient.


For Python I'd also highly recommend Clear Linux for raw performance. It's not quite as easy to get started as with something like Ubuntu though.


Checked it out; it is Intel-supported, with compiler tweaks to get the most out of their CPUs for libs like math, pandas, etc.


Clear Linux is definitely an interesting project; however, using a base image with the 'latest' tag (the only tag existing for https://hub.docker.com/r/clearlinux/python/tags) in production is not the best strategy, as breaking changes can arrive at any time.



If you want small images, why not use a tool like https://github.com/docker-slim/docker-slim Then it doesn’t matter which distro you favor?


Wow, I've never heard of this before - I'm looking forward to seeing how much this shrinks my images!


Search for "awesome-docker" and make sure you have some free time. :)


I don't quite get it. If you've somehow statically compiled all your dependencies, shouldn't that just run without a container?

Perhaps the point is not to enable program execution, but to make use of the benefits that may come with container orchestration.


Well at the very least running in a container gives you filesystem and network and PID space isolation, optionally also user namespace isolation.


> I’d think that a few megabytes of disk isn’t as valuable as the extra cpu cycles.

Depends on your workload, of course. Some people want to run a huge number of containers and each isn’t compute intensive.

Or maybe you don’t use libc at all in your fast path?

Lots of cases it makes sense.


If you reuse the base image, the libc files will be shared anyway.


This is the part most people don't realise.

They see Alpine at 5Mb and Ubuntu at 80Mb. They mentally multiply, without realising that the base layer is pulled only once, no matter how many images are built on top of it.

For a large cluster it's a wash. You might as well use Ubuntu, Centos -- anything where there are people working fulltime to fix CVEs quickly.


As far as I'm aware you'd have to load that 80mb into memory for each docker container you run so that can add up if you want to run a bunch of containers on a cheap host with 1GB of RAM.

I do agree that people prematurely optimise and mainly incorrectly consider disk space but I think there's a decent use case for tiny images.


Not quite. The 80Mb represents the uncompressed on-disk size. Those bits can appear in memory in two main ways. They can be executable, in which case large parts will be shared (remember, containers are not like VMs). Or they can be in the FS cache, in which case bits will be evicted as necessary to make room for executables.

There's a case for tiny images, but it's in severely-constrained environments. Otherwise folks are fetishising the wrong thing based on a misunderstanding of how container images and runtimes work.


I've no idea why you wouldn't use Ubuntu which is only around 40mb, has a sane package manager and a standard glibc.


> Ubuntu which is only around 40mb […]

I just downloaded Ubuntu 18.04 and 19.04 and they are not 40MB:

    $ docker image ls | grep ubuntu
    ubuntu  19.04  f723e3b6f1bd   76.4MB
    ubuntu  18.04  d131e0fa2585  102.0MB
    ubuntu  16.04  a51debf7e1eb  116.0MB
How do you get a 40MB Ubuntu Docker image?

---

I followed @sofaofthedamned — https://blog.ubuntu.com/2018/07/09/minimal-ubuntu-released

But I’m still confused, where are they getting 29MB from? Even the compressed files are big:

    Ubuntu Bionic [1]
    - ubuntu-18.04-minimal-cloudimg-amd64-root.tar.xz |  77M
    - ubuntu-18.04-minimal-cloudimg-amd64.img         | 163M
    - ubuntu-18.04-minimal-cloudimg-amd64.squashfs    |  96M
    
    Ubuntu Cosmic [2]
    - ubuntu-18.10-minimal-cloudimg-amd64-root.tar.xz | 210M
    - ubuntu-18.10-minimal-cloudimg-amd64.img         | 295M
    - ubuntu-18.10-minimal-cloudimg-amd64.squashfs    | 229M
    
    Ubuntu Disco [3]
    - ubuntu-19.04-minimal-cloudimg-amd64-root.tar.xz |  69M
    - ubuntu-19.04-minimal-cloudimg-amd64.img         | 155M
    - ubuntu-19.04-minimal-cloudimg-amd64.squashfs    |  89M

[1] http://cloud-images.ubuntu.com/minimal/releases/bionic/relea...

[2] http://cloud-images.ubuntu.com/minimal/releases/cosmic/relea...

[3] http://cloud-images.ubuntu.com/minimal/releases/disco/releas...



Don't feel bad. You've discovered the charming little fact that the registry API will report compressed size and the Docker daemon will report uncompressed size.


I switched to Ubuntu Minimal for my cloud instances. It works great. Highly recommended.


Ubuntu used to be much larger so I wouldn't be surprised if people switched to Alpine and never looked back.


40mb vs 5mb is like 5x difference.

There are also slimmed down images based on Debian or Ubuntu. A number of packages are a bit older versions, though.


You're only paying that 40mb once though. Multiple containers sharing the same parent layers will not require additional storage for the core OS layer.


Are you guys running everything on a single box?

Do you get all developers to agree on which base image to build all their services from?

I heard about this "oh, it's shared, don't worry" thing before. It started with 40MB. Now that supposedly shared image is half a gig. "Don't worry, it's shared anyway". Except when it isn't. And when it is, it still slows us down in bringing up new nodes. And guess what, turns out that not everyone is starting from the same point, so there is a multitude of 'shared' images now.

Storage is cheap, but bandwidth may not be. And it still takes time to download. Try to keep your containers as small as possible for as long as possible. Your tech debt may grow slower that way.


As it happens, you're describing one of the motivations for Cloud Native Buildpacks[0]: consistent image layering leading to (very) efficient image updates.

Images built from dockerfiles can do this too, but it requires some degree of centralisation and control. Recently folks have done this with One Multibuild To Rule Them All.

By the time you're going to the trouble of reinventing buildpacks ... why not just use buildpacks? Let someone else worry about watching all the upstream dependencies, let someone else find and fix all the weird things that build systems can barf up, let someone else do all the heavy testing so you don't have to.

Disclosure: I worked on Cloud Native Buildpacks for a little while.

[0] https://buildpacks.io/


> single box

In production, the smallest box has half a gig of RAM.

In development, it's indeed a single box, usually a laptop.

> all developers to agree on which base image to build all their services from

Yes. In a small org it's easy. In a large org devops people will quickly explain the benefits of standardizing on one, at most two, base images. Special services that are run from a third-party image are a different beast.

> Storage is cheap, but bandwidth may not be.

Verily! Bandwidth, build time, etc.


Exactly. With copy on write and samepage merging it's more important to use the same base image for all your deployments.


Technically it's closer to 8x, which is quite a bit. That said, even if it's a big relative difference, in absolute terms it's 40mb, which is very little even if you have to transfer it up.


It's sad to me that it wasn't obvious to you 5*5 is not 40, or 40/5 is not 5.


In particular since the base image is shared among all containers using it. It's 35 mb extra for n containers (n>>1).


Exactly. Amazed more people don't know this.


Ubuntu is 40MB but if you add a few packages with tons of dependencies it can quickly reach 800MB. Alpine has much more reasonable dependency trees.


--no-install-recommends is your friend
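For example, in a Dockerfile RUN step ("curl" is just an example package):

  apt-get update && apt-get install -y --no-install-recommends curl \
      && rm -rf /var/lib/apt/lists/*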


Many debian based images set this option by default.


And a shitty library at the heart that makes everything suck just a bit more.


The base image may be small, but all the packages and metadata are large, and the dependencies are many. Alpine always leans to conservative options and intentionally removes mostly-unnecessary things. So average image sizes are higher with an Ubuntu base compared to Alpine.


My inability to personally audit systemd would be at the top of the list.


Why would anyone run an init system inside a container?

Just run the process. Let whatever is managing your container restart it if the process quits, be it docker or K8s.


Because sometimes it is easier to ship one thing to customers than 10 different images and several different run manifests.

There's plenty of other reasons.


Here is one reason: https://news.ycombinator.com/item?id=10852055 (TL;DR - default signal behaviour is different for PID 1).
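If all you need is sane PID 1 signal handling, docker run's --init flag will inject a minimal init (tini) as PID 1:

  docker run --init --rm myimage   # "myimage" is a placeholder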

[Edit - fix link]


given that containers usually do not include any init system at all that’s not a good reason to pick a side.


Ubuntu Docker images do not run systemd by default or in most configurations. In many images it is removed entirely.


I wouldn't be too concerned about that in a container since you're probably not running systemd in that context.


As far as I can tell, the recommended way to run several processes in an Ubuntu container is under supervisord. The default Ubuntu containers don't even include an init system.


Supervisord is an awful thing to run inside a container, tbf. Any OOM event and it will kill a random process. Not good in a container environment.


I've only ever done it when I needed to run exactly two processes: porting an older piece of software that was not originally designed to run inside containers, where orchestrating the separate processes to communicate and run in separate containers didn't seem worth the effort.


And supervisord will restart it.


I’ll try not to be opinionated, but starting an app inside Ubuntu typically has 50+ processes.

In most cases with Alpine-based containers, the only process is the one that you actually want to run.

Add to that that modern Ubuntu uses systemd, which greatly exhausts the system's inotify limits, so running 3-4 Ubuntu containers can easily kill a system's ability to use inotify at all, across containers and the host system. Causing all kinds of fun issues, I assure you.

So the cost is not just about disk-space.

Disclaimer: more experience with LXC than Docker.


I don't think your LXC experience applies to Docker. Most Ubuntu-based Docker containers are not running full init.


Their minimal images are, well, minimal. Maybe not as minimal as Alpine, but not heavy either.


I guess the question is how much slower?


Benchmarks here: https://www.phoronix.com/scan.php?page=article&item=docker-s...

For Python in particular, significantly slower.


The Python test is indeed a large outlier! I wonder how reproducible it is, and what the reason might be.


That is very likely due to processor/architecture optimized precompiled Python. See e.g. https://clearlinux.org/blogs/transparent-use-library-package... and https://clearlinux.org/blogs/boosting-python-profile-guided-...


I did an analysis with the official Docker Python image [0] with the --enable-optimizations flag. Although not the point of the results [1], they show a decent speed up when using Debian vs Alpine as well.

[0] https://github.com/docker-library/python/issues/160 [1] https://gist.github.com/blopker/9fff37e67f14143d2757f9f2172c...


Because it's tiny, I tend to default to Alpine and then move away from it where necessary. Rather than worrying about potential CPU performance requirements upfront - premature optimisation and all that.


Isn't there an argument that using Alpine instead of something like Ubuntu is premature optimization for space?


It's faster in terms of development time. Faster to install packages and to push to docker repos.


Why? You should only need to push the base image once. Then only upper layers will have to be pushed.


Making it smaller than necessary is a bigger premature optimisation, though. Rather than worrying about potential disk space issues upfront just use glibc and everything works fast and fine.


If your software is heavily CPU-bound, and the difference matters for you, you can likely find or build a highly optimized image for your particular number-crunching task.

If your software is I/O-bound, and is written in a language like Python or Ruby, and sits idle waiting for requests for significant time anyway, CPU performance is likely not key for you. This also represents the majority case, AFAICT.


Why is musl slower?


I don't want to disparage any project or guess at the motivation, but there have been some undercurrents of anti-GPL and anti-complexity sentiment at times. Folks have sort of backed it up by posting links to things that aren't as you might think. GLIBC in particular looks nothing like how you might imagine it. You can look at strlen in the K&R book and it's beautiful, like a textbook:

    int strlen(char s[])
    {
        int i;

        i = 0;
        while (s[i] != '\0')
            ++i;
        return i;
    }

Then you look at: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strl...

and your mind will sort of explode for a bit. The difference is that the GLIBC version is dramatically faster; take the comments away and most of us wouldn't even know that's strlen. It's more complex, no question, but it's much faster. GLIBC is full of stuff like that. qsort and memcpy are non-obvious to many folks. It's not complexity for no reason; you'd be challenged to build a better qsort than the one in glibc, it's not easy.


You'd have to use a pretty esoteric platform to get that version. For most platforms you'll have a hand-rolled assembly version that's dramatically faster than that one (the x86_64 one is here[1]).

[1]: https://github.com/bminor/glibc/blob/master/sysdeps/x86_64/s...


This reminds me of a classic blog post:

    http://ridiculousfish.com/blog/posts/old-age-and-treachery.html
All of those old Unix programs aren't fast by being simple and clean on the inside. Old age and treachery...



It's much simpler and more oriented towards POSIX compatibility than performance.


* musl prioritizes thread safety and static linking over performance

* most software is written and optimized against glibc, not musl


The small size of an image can be an actual issue. Certainly for embedded devices, probably also for clusters.

For me it's mostly a non-issue. But things like redis or nginx work nicely with alpine as a base. And more than likely, whatever I want to containerize has already been containerized for Alpine. If not, getting something to work on Alpine may just not be worth it...


>"It’s awesome that the images are pretty small, but a wide variety of software has been shown to run noticeably slower on Alpine compared to other distributions, in part due to its usage of musl instead of glibc"

Might you have any citation(s) for this? I have not heard this before. Could you elaborate on what some the "wide variety of software" is? Thanks.


I've heard that having shared libraries like glibc allows for commonly shared hotpaths to remain in the CPU cache, making it faster. Whereas in musl, since the binaries all have their own copies of procedures, they are more often kicked out of cache. I don't imagine it has a big impact on a container where you are only running one or two binaries. Perhaps in a desktop environment with hundreds or thousands of different programs running, it might make a difference.


It's not just a few megabytes, it's more like hundreds. If you're moving an image across a bad network (like the internet) that could be hundreds of milliseconds.


I would also add that data transfers cost money, and having to transfer a few hundred MBs each time a container image is passed around can reflect in the expenses.


It's not a few megabytes, it's a few hundred to a thousand megabytes saved, on average. Multiplied by a thousand containers, and much larger layers on build servers, plus bandwidth, it makes a difference.

Worst case for slower processes, things take longer. Worst case for more disk use, things start crashing. For general cases, the former is preferable.


No, it's not. With samepage merging it's nothing, let alone docker only loading the image once.


This assumes everyone is on exactly the same version. Larger organizations can afford to appoint SREs who can institute a broad range of security and optimization policies (including "all apps should use the same Alpine base image version") and enforce them programmatically as well as ensure that apps are continually updated to match them. That kind of thing is expensive, resource-wise, for smaller organizations.


I'm not talking about image waste, I'm talking a single box with a half dozen different base image versions and a slew of extra packages thrown in. Every app built against glibc is going to get bigger, and not all Ubuntu packages are built with small size in mind. Compare these systems with Ubuntu vs Alpine base images and the average size for an Ubuntu ecosystem is substantially larger.


What is the better and safer alternative for cases where some extra megabytes are ok?


What exactly is meant by null? The null/zero character, no password, or the 4-char string "null"?


It's sloppy writing. They mean it is the empty string in /etc/shadow.

Following some links from the CVE, you can find the details (from https://talosintelligence.com/vulnerability_reports/TALOS-20...):

> In builds of the Alpine Docker Image (>=3.3) the /etc/shadow file contains a blank field in place of the encrypted password

> ...

> The net result of a blank sp_pwdp field is that the system will treat the root user as having no password, rather than a 'locked' account if a ! or * is explicitly specified.

For those not super familiar with how Unix/Linux password files work, if the field is non-empty, the system will collect a password, hash it, and compare for a match. If the field is empty, the system will just skip prompting for a password and log you in after entering a username.

Arguably, this is crappy design and a better design for /etc/shadow would be to require some kind of explicit, very obvious value like "NO-PASSWORD-REQUIRED".


> Arguably, this is crappy design and a better design for /etc/shadow would be to require some kind of explicit, very obvious value like "NO-PASSWORD-REQUIRED".

I think pretty much all Linux distributions use PAM nowadays (although I'm sure there are exceptions).

From pam_unix.so(8):

> The default action of this module is to not permit the user access to a service if their official password is blank.

Unfortunately, many distributions pass the "nullok" option to the pam_unix.so module, overriding this default behavior. Removing any instances of "nullok" from any files in /etc/pam.d/ is highly recommended -- unless you explicitly want this behavior (which could be a reasonable choice, in some specific use cases).
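A rough way to audit for it (file names vary by distro; Debian/Ubuntu, for example, keep it in /etc/pam.d/common-auth):

  grep -rn nullok /etc/pam.d/
  # then delete the nullok (or nullok_secure) token from any pam_unix.so line it reports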


Thanks!


NULL as in there is no password of any kind configured.


The NUL character is not allowed in a Linux password.

Also the text of the CVE is clearer.

> contain a NULL password for the `root` user

It's no password.


Note the conditional:

"systems deployed using affected versions of the Alpine Linux container which utilize Linux PAM, or some other mechanism which uses the system shadow file as an authentication database, may accept a NULL password for the `root` user"

So if you're not using a container running ssh/etc, it doesn't affect you.

Serious question: what's the use of PAM in a docker image that's vulnerable here?


I also note they could run head on /etc/shadow without sudo

> docker run -it alpine:3.$i head -n 1 /etc/shadow

The container process itself is running as the locally namespaced root account anyway.

From my possibly incorrect interpretation... unless you are trying to build containers like VMs with multiple users and init etc, this null root password issue seems irrelevant.


For sshd you would have to set PermitEmptyPasswords to yes for it to matter, right?


and also PermitRootLogin, AFAIK


I highly recommend using user namespaces for Docker. [1]

It means the host is protected from privilege escalation in the container.

[1] https://docs.docker.com/engine/security/userns-remap/
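Enabling it daemon-wide looks roughly like this (a sketch; note that the remapped daemon cannot see images and containers created before the switch, and this overwrites any existing daemon.json):

  echo '{ "userns-remap": "default" }' | sudo tee /etc/docker/daemon.json
  sudo systemctl restart docker
  docker run --rm alpine id   # uid 0 in here now maps to an unprivileged high uid on the host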


> Eight days after this vulnerability was initially fixed, a commit was pushed which removed this 'disable root by default' flag from the 'edge' build properties file, reintroducing this issue to subsequent builds.

That... did not take long. From fix to regression on a fairly serious security issue (though perhaps not critical, depending on how the container infrastructure is set up).

> Unfortunately, later that same year, a commit was pushed to simplify the regression tests. This led to the logic that may have caught this regression being simplified, causing these tests to be incorrectly 'satisfied' if the root password was once again removed.

Testing is hard, and ensuring you meet the requirements but keeping it readable can be harder.


> Versions of the Official Alpine Linux Docker images (since v3.3) contain a NULL password for the `root` user. This vulnerability appears to be the result of a regression introduced in December of 2015.

Emphasis mine.

If this tells us anything, it is how inconsequential this is.

A lot of software already runs with uid 0 inside a container. So, if you got an RCE in that software, you've got a container root anyway. Containers normally do not expose anything else but the target daemon to the network, so cracking in through a non-privileged daemon is unlikely.

Container's root does not map to host's root, so any intrusion is most likely limited to the container if detected soon enough (though it's far from being bullet-proof).


Container's root most certainly does map to host root unless you've enabled user namespaces (which is painful).

However, root in a container does have many privileges dropped and seccomp policies applied (assuming you haven't turned seccomp off via k8s).


> The likelihood of exploitation of this vulnerability is environment-dependent, as successful exploitation requires that an exposed service or application utilise Linux PAM, or some other mechanism which uses the system shadow file as an authentication database.

It depends how you have things set up, and what the container is actually doing, and what it's linked to. So it may not be as critical as root on the server, but it may also allow some damage that is more than irritating to happen.

I would call it serious, but not critical.


Seriously, if an attacker is in a position to exploit this, hasn’t everything already failed?

Or to put another way, under what condition is this actually helpful defense in depth?

Nothing should be able to log in locally to an Alpine container. All authentication should be public key based to begin with.

What would the root password be if not null?


> What would the root password be if not null?

Deactivated, allowing no root login.

EDIT:

> Seriously, if an attacker is in a position to exploit this, hasn’t everything already failed?

Not necessarily, I think. Something might support PAM auth (by accident or intentionally), and that wouldn't allow anything if no accounts, or only accounts with strong passwords, can be logged into, but it obviously breaks once root is available for login.


I believe it's public on their GitHub since 5 Aug 2018. https://github.com/gliderlabs/docker-alpine/issues/430

> 2019-03-01 - It was discovered that this issue was also reported and made public in their Github prior to our report, but was not flagged as a security issue and thus remained unresolved until it was rediscovered and reported by Cisco.

https://talosintelligence.com/vulnerability_reports/TALOS-20...


> https://github.com/gliderlabs/docker-alpine/issues/430

That issue is claimed to have been fixed, with a reference to a commit of the updated images; it says issue 430 is a security issue and closes it, but there is no link to the actual fix.

Word to the wise folks: If you are fixing bugs by posting binaries, it's a good idea to include a reference to the git hash of the actual fixes you've built those binaries with.


This is not the official Alpine image, it is Glider Labs one.


https://github.com/docker-library/official-images/pull/5516

The official image was affected too, it appears.


Yeah, official alpine one looks fine

    $ docker run -it alpine head -1 /etc/shadow
    root:!::0:::::


Going through alpine:3.1 to alpine:3.9, then alpine:edge, I see that the following versions have the problem: 3.3, 3.4, 3.5, and 3.8.
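A sweep like that can be scripted roughly as:

  for v in 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 edge; do
    printf 'alpine:%-5s ' "$v"
    docker run --rm alpine:$v head -1 /etc/shadow
  done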


3.8 looks fine?

  » docker run --rm -ti alpine:3.8 sh -c 'cat /etc/shadow | grep root'
  root:!::0:::::
The others you mention though, I agree, they look less than fine.

Also, has anyone reported this to the official Alpine repository? (since the Talos disclosure seems to be confused; it says official but has URLs to the Glider Labs version?)

Edit: Ah, so here's the relevant GitHub issue for official Alpine Linux docker: https://github.com/docker-library/official-images/pull/5516

<=3.5 is no longer supported. Everything newer is patched.


What should be done to prevent the NULL password then? Like just setting a custom strong password for the root user in the alpine container Dockerfile? If yes, could you share the recipe for how to do it correctly?


Update your base images
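If you also want a belt-and-braces check in your own Dockerfile, something like this should work (a sketch, not an official recipe):

  FROM alpine:3.9
  # lock root explicitly instead of relying on the base image default,
  # and fail the build if the password field ever looks blank again
  RUN sed -i -e 's/^root::/root:!:/' /etc/shadow && ! grep -q '^root::' /etc/shadow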


You're right, they've patched 3.8; I was hitting an older cached image that I had laying around.


Hmm? Per the Date Entry Created this was reported months ago (EDIT: I can't read), although the linked report (https://talosintelligence.com/vulnerability_reports/TALOS-20...) was posted today


> Disclaimer: The entry creation date may reflect when the CVE ID was allocated or reserved, and does not necessarily indicate when this vulnerability was discovered, shared with the affected vendor, publicly disclosed, or updated in CVE.


If your box gets rooted because of this then you already had glaring security issues.


I thought this was intentional when I stumbled on it a month ago.



