Mobile devices are getting better and better at solving sophisticated tasks – not only because of faster hardware, but also thanks to the trend towards on-device AI. Tasks such as face detection, barcode recognition, rectangle detection and text recognition are now supported at the operating-system level, which makes them really simple to use in your app. Here I am going to show how to detect face landmarks in real time using the Vision framework. The demo app that we're going to build is also available on GitHub.
AVCaptureSession
The first thing to do is to configure an instance of AVCaptureSession to capture the video stream from the front camera. We’re going to direct the stream to
- AVCaptureVideoPreviewLayer to preview it on the screen
- AVCaptureVideoDataOutput to perform the face landmarks detection
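The preview side isn't covered in the snippets below, so here is a minimal sketch of how the two layers might be set up in the hosting view controller. The `_captureSession` and `_shapeLayer` field names are assumptions matching the code later in the post, not necessarily the demo app's exact code:

```csharp
// Sketch (assumed setup, not the original sample): attach a preview layer
// for the camera feed plus an overlay layer for the landmark drawings.
var previewLayer = new AVCaptureVideoPreviewLayer(_captureSession)
{
    Frame = View.Frame,
    VideoGravity = AVLayerVideoGravity.ResizeAspectFill
};
View.Layer.AddSublayer(previewLayer);

// Landmarks get drawn into this layer on top of the preview.
_shapeLayer = new CAShapeLayer { Frame = View.Frame };
View.Layer.AddSublayer(_shapeLayer);
```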
Let's start with a small helper method to get the front camera AVCaptureDevice. We're using AVCaptureDeviceDiscoverySession, specifying that we're interested in the front camera.
```csharp
public AVCaptureDevice GetDevice()
{
    var videoDeviceDiscoverySession = AVCaptureDeviceDiscoverySession.Create(
        new AVCaptureDeviceType[] { AVCaptureDeviceType.BuiltInWideAngleCamera },
        AVMediaType.Video,
        AVCaptureDevicePosition.Front);

    return videoDeviceDiscoverySession.Devices.FirstOrDefault();
}
```
Now the AVCaptureSession itself.
```csharp
public void ConfigureDeviceAndStart()
{
    var device = GetDevice();

    if (device.LockForConfiguration(out var error))
    {
        if (device.IsFocusModeSupported(AVCaptureFocusMode.ContinuousAutoFocus))
        {
            device.FocusMode = AVCaptureFocusMode.ContinuousAutoFocus;
        }
        device.UnlockForConfiguration();
    }

    // Configure Input
    var input = AVCaptureDeviceInput.FromDevice(device, out var error2);
    _captureSession.AddInput(input);

    // Configure Output
    var settings = new AVVideoSettingsUncompressed()
    {
        PixelFormatType = CoreVideo.CVPixelFormatType.CV32BGRA
    };
    var videoOutput = new AVCaptureVideoDataOutput
    {
        WeakVideoSettings = settings.Dictionary,
        AlwaysDiscardsLateVideoFrames = true
    };
    var videoCaptureQueue = new DispatchQueue("Video Queue");
    videoOutput.SetSampleBufferDelegateQueue(new OutputRecorder(View, _shapeLayer), videoCaptureQueue);

    if (_captureSession.CanAddOutput(videoOutput))
    {
        _captureSession.AddOutput(videoOutput);
    }

    // Start session
    _captureSession.StartRunning();
}
```
Here we're setting up the capture session by adding instances of the AVCaptureDeviceInput and AVCaptureVideoDataOutput classes. We're setting AlwaysDiscardsLateVideoFrames to true to save some memory (it's true by default, but let's make it explicit). The important piece here is the OutputRecorder – our implementation of IAVCaptureVideoDataOutputSampleBufferDelegate, which will do the face landmarks detection.
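A session started this way should also be stopped when the screen goes away. That part isn't in the post's snippets, but a hypothetical counterpart to ConfigureDeviceAndStart could be as simple as:

```csharp
// Sketch (not part of the original sample): stop capturing when the
// view controller's view disappears.
public override void ViewWillDisappear(bool animated)
{
    base.ViewWillDisappear(animated);

    if (_captureSession.Running)
    {
        _captureSession.StopRunning();
    }
}
```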
VNSequenceRequestHandler and VNDetectFaceLandmarksRequest
At this point, we have the configured AVCaptureSession and we’re ready to process the output to detect face landmarks. To do this let’s override the DidOutputSampleBuffer method.
```csharp
public class OutputRecorder : AVCaptureVideoDataOutputSampleBufferDelegate
{
    public override void DidOutputSampleBuffer(AVCaptureOutput captureOutput, CMSampleBuffer sampleBuffer, AVCaptureConnection connection)
    {
        using (var pixelBuffer = sampleBuffer.GetImageBuffer())
        using (var ciImage = new CIImage(pixelBuffer))
        using (var imageWithOrientation = ciImage.CreateByApplyingOrientation(ImageIO.CGImagePropertyOrientation.LeftMirrored))
        {
            DetectFaceLandmarks(imageWithOrientation);
        }
        sampleBuffer.Dispose();
    }

    ...
}
```
The method is called every time a new frame is captured. We create a CIImage and pass it to the DetectFaceLandmarks method, which uses the Vision framework to detect face landmarks and draw them on the overlay layer. Note that we need to properly dispose of all objects, otherwise the app becomes unresponsive very quickly.
```csharp
VNSequenceRequestHandler _sequenceRequestHandler = new VNSequenceRequestHandler();
VNDetectFaceLandmarksRequest _detectFaceLandmarksRequest;

void DetectFaceLandmarks(CIImage imageWithOrientation)
{
    if (_detectFaceLandmarksRequest == null)
    {
        _detectFaceLandmarksRequest = new VNDetectFaceLandmarksRequest((request, error) =>
        {
            RemoveSublayers(_shapeLayer);

            if (error != null)
            {
                throw new Exception(error.LocalizedDescription);
            }

            var results = request.GetResults<VNFaceObservation>();
            foreach (var result in results)
            {
                if (result.Landmarks == null)
                {
                    continue;
                }

                var boundingBox = result.BoundingBox;
                var scaledBoundingBox = Scale(boundingBox, _view.Bounds.Size);

                InvokeOnMainThread(() =>
                {
                    DrawLandmark(result.Landmarks.FaceContour, scaledBoundingBox, false, UIColor.White);
                    DrawLandmark(result.Landmarks.LeftEye, scaledBoundingBox, true, UIColor.Green);
                    DrawLandmark(result.Landmarks.RightEye, scaledBoundingBox, true, UIColor.Green);
                    DrawLandmark(result.Landmarks.Nose, scaledBoundingBox, true, UIColor.Blue);
                    DrawLandmark(result.Landmarks.NoseCrest, scaledBoundingBox, false, UIColor.Blue);
                    DrawLandmark(result.Landmarks.InnerLips, scaledBoundingBox, true, UIColor.Yellow);
                    DrawLandmark(result.Landmarks.OuterLips, scaledBoundingBox, true, UIColor.Yellow);
                    DrawLandmark(result.Landmarks.LeftEyebrow, scaledBoundingBox, false, UIColor.Blue);
                    DrawLandmark(result.Landmarks.RightEyebrow, scaledBoundingBox, false, UIColor.Blue);
                });
            }
        });
    }

    _sequenceRequestHandler.Perform(new[] { _detectFaceLandmarksRequest }, imageWithOrientation, out var requestHandlerError);
    if (requestHandlerError != null)
    {
        throw new Exception(requestHandlerError.LocalizedDescription);
    }
}
```
The method is quite simple. First, we create a VNDetectFaceLandmarksRequest with a completion handler that iterates through all results and draws them (note that we're doing the drawing on the UI thread). Second, we use the VNSequenceRequestHandler to perform the request on the CIImage from the previous step.
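The Scale and RemoveSublayers helpers used above aren't shown in these snippets. Assuming the bounding box comes back in Vision's normalized [0..1] coordinates, they could be sketched roughly like this (an assumption based on how they're called, not the demo app's exact code):

```csharp
// Sketch of the two helpers referenced above (assumed implementations).
// Vision reports the face bounding box in normalized coordinates, so we
// scale it up to the size of the view we're drawing on.
CGRect Scale(CGRect boundingBox, CGSize size)
{
    return new CGRect(
        boundingBox.X * size.Width,
        boundingBox.Y * size.Height,
        boundingBox.Width * size.Width,
        boundingBox.Height * size.Height);
}

// Clear the landmark layers drawn for the previous frame.
void RemoveSublayers(CAShapeLayer shapeLayer)
{
    foreach (var layer in shapeLayer.Sublayers ?? Array.Empty<CALayer>())
    {
        layer.RemoveFromSuperLayer();
    }
}
```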
And lastly, the DrawLandmark method:
```csharp
void DrawLandmark(VNFaceLandmarkRegion2D feature, CGRect scaledBoundingBox, bool closed, UIColor color)
{
    // Materialize the points once so the LINQ projection isn't re-run
    // by the First()/Skip() calls below.
    var mappedPoints = feature.NormalizedPoints
        .Select(o => new CGPoint(
            x: o.X * scaledBoundingBox.Width + scaledBoundingBox.X,
            y: o.Y * scaledBoundingBox.Height + scaledBoundingBox.Y))
        .ToArray();

    using (var newLayer = new CAShapeLayer())
    {
        newLayer.Frame = _view.Frame;
        newLayer.StrokeColor = color.CGColor;
        newLayer.LineWidth = 2;
        newLayer.FillColor = UIColor.Clear.CGColor;

        using (UIBezierPath path = new UIBezierPath())
        {
            path.MoveTo(mappedPoints.First());
            foreach (var point in mappedPoints.Skip(1))
            {
                path.AddLineTo(point);
            }
            if (closed)
            {
                path.AddLineTo(mappedPoints.First());
            }
            newLayer.Path = path.CGPath;
        }

        _shapeLayer.AddSublayer(newLayer);
    }
}
```
Since the Vision framework returns normalized landmark points, we transform them to screen coordinates before drawing. The rest of the code just adds a new CAShapeLayer with the drawn path.
Conclusion
Here I showed you how simple it is to perform such a complex task as the detection of facial landmarks. If you're creating your own app that uses this feature, don't forget to add an NSCameraUsageDescription to your Info.plist. Also, keep in mind that the Vision framework is available on iOS 11+. Happy coding!
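Besides the Info.plist entry, the user also has to grant camera access at runtime. One way to check before starting the session (a sketch under the assumption that ConfigureDeviceAndStart from above is the entry point; not part of the demo app) is:

```csharp
// Sketch: ask for camera permission, then start the capture session on
// the main thread once access is granted.
AVCaptureDevice.RequestAccessForMediaType(AVMediaType.Video, granted =>
{
    if (granted)
    {
        InvokeOnMainThread(ConfigureDeviceAndStart);
    }
});
```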