In a previous blog post we used Metal and the Metal Shading Language to apply an effect to video in real time. That implementation processed only the current frame, with no dependency between frames, which made it very simple. But many effects require blending the current frame with the previous frame (or several previous frames), such as this one on ShaderToy: Diffusion Experiment 1. Take a look at the code in the Buf A tab:

vec4 cell(vec2 fragCoord, vec2 pixel)
{
	vec2 uv = (fragCoord-pixel) / iResolution.xy;
    return texture(iChannel0, uv);
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 uv = fragCoord / iResolution.xy;
    
    if (iFrame < 5) // init buffer with iChannel2 texture
    {
        fragColor = texture(iChannel1, uv);
    }
    else
    {
        // get adjacents cells from backbuffer 
        vec4 l = cell(fragCoord, vec2(-1,0)); // left cell
        vec4 r = cell(fragCoord, vec2(1,0)); // right cell
        vec4 t = cell(fragCoord, vec2(0,1)); // top cell
        vec4 b = cell(fragCoord, vec2(0,-1)); // bottom cell
        
        // get current cell from backbuffer
        vec4 c = cell(fragCoord, vec2(0,0)); // central cell
        
        // quad dist from cells
        fragColor = max(c, max(l,max(r,max(t,b))));
        
        // video merge
        if (iMouse.z < .1)
         	fragColor = fragColor * .95 + texture(iChannel1, uv) * .05;
	}
}

It's easy to see the approach: a buffer holds the previous frame's result, and on the next frame the colors around fragCoord in that buffer are read and blended with the current frame's color at fragCoord. Once you understand this, translating it into Metal is straightforward:

kernel void diffusion(texture2d<float, access::read> inTexture [[ texture(0) ]],
                      texture2d<float, access::write> outTexture [[ texture(1) ]],
                      texture2d<float, access::read> lastTexture [[ texture(2) ]],
                      device const float *time [[ buffer(0) ]],
                      uint2 gid [[ thread_position_in_grid ]])
{
    // Clamp neighbor coordinates so edge pixels never read out of bounds
    // (gid is unsigned, so expressions like gid - uint2(-1, 0) rely on
    // wraparound and are unsafe at the texture borders)
    uint maxX = lastTexture.get_width() - 1;
    uint maxY = lastTexture.get_height() - 1;
    float4 l = lastTexture.read(uint2(gid.x > 0 ? gid.x - 1 : 0, gid.y)); // left cell
    float4 r = lastTexture.read(uint2(min(gid.x + 1, maxX), gid.y));      // right cell
    float4 t = lastTexture.read(uint2(gid.x, gid.y > 0 ? gid.y - 1 : 0)); // top cell
    float4 b = lastTexture.read(uint2(gid.x, min(gid.y + 1, maxY)));      // bottom cell
    float4 c = lastTexture.read(gid);                                     // central cell
    
    float4 m = max(c, max(l, max(r, max(t, b))));
    float4 result = m * 0.95 + inTexture.read(gid) * 0.05;
    outTexture.write(result, gid);
}
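To get an intuition for the 0.95/0.05 blend, consider a single pixel that is bright for one frame and then dark: each subsequent frame multiplies the stored value by 0.95, so the trail decays exponentially. A minimal scalar sketch of that recurrence (plain Swift, not shader code):

```swift
// Scalar simulation of result = last * 0.95 + current * 0.05 for one pixel
// that is bright (1.0) in frame 0 and dark (0.0) in every later frame.
var last: Float = 1.0
for frame in 1...5 {
    let current: Float = 0.0
    last = last * 0.95 + current * 0.05
    print("frame \(frame): last = \(last)")
}
// After n frames the original brightness has decayed to 0.95^n,
// which is what produces the fading trail on screen.
```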

Here, to keep things simple, I skipped the part that returns the original texture unchanged for the first 5 frames. The host code needs corresponding changes. First, modify the function that initializes the MTLComputePipelineState:

private func initializeComputePipeline() {
    let library = device.makeDefaultLibrary()
    let shader = library?.makeFunction(name: "diffusion")
    computePipelineState = try! device.makeComputePipelineState(function: shader!)
}

Then, at the top of ViewController.swift, add an MTLTexture variable to hold the previous frame's result:

......

var sourceTexture: MTLTexture?
var lastTexture: MTLTexture?
var textureCache: CVMetalTextureCache?

......

Next, modify the draw(in:) method to initialize the lastTexture variable and bind it as the texture(2) argument of the diffusion function:

func draw(in view: MTKView) {
    guard let currentDrawable = mtkView.currentDrawable, let texture = sourceTexture else {
        return
    }
    
    if lastTexture == nil {
        lastTexture = texture.makeTextureView(pixelFormat: texture.pixelFormat)
    }
    
    let commandBuffer = commandQueue.makeCommandBuffer()
    let computeCommandEncoder = commandBuffer?.makeComputeCommandEncoder()
    computeCommandEncoder?.setComputePipelineState(computePipelineState!)
    computeCommandEncoder?.setTexture(texture, index: 0)
    computeCommandEncoder?.setTexture(currentDrawable.texture, index: 1)
    computeCommandEncoder?.setTexture(lastTexture, index: 2)
    
    ......
    
}

Finally, once the shader function has been encoded, save the result into the lastTexture variable:

......

lastTexture = currentDrawable.texture
        
commandBuffer?.present(currentDrawable)
commandBuffer?.commit()

......

Build and run to see the effect (viewers in mainland China may need a VPN):

As the steps above show, switching to a different effect requires changes in several places, which is unacceptable in real-world code. If you think about it, our requirement is actually simple: take the raw CMSampleBuffer (for video, a CVPixelBuffer) delivered to captureOutput(_:didOutput:from:) and turn it into a video image with the effect applied. From this perspective, it is easy to abstract a shader function prototype: input an MTLTexture converted from the original image, output the processed MTLTexture.

Back in iOS 9, Apple added a framework to Metal called Metal Performance Shaders (MPS for short). It implements a set of common image-processing functions, and iOS 10 added convolutional neural network (CNN), matrix multiplication, and related functions for machine learning, all high-performance and well optimized. We won't focus on those here; instead, let's look at its API design.

For kernel functions, MPS provides an MPSKernel class, which is the base class for the other classes in the framework; from it are derived the MPSUnaryImageKernel and MPSBinaryImageKernel classes. These three classes form the foundation of the MPS framework. Their subclasses only need to define specific initializers and an encode method, which makes them very easy to use; you can refer to the steps I translated, or to Apple's official demo. For example, MPSImageGaussianBlur can be used like this (omitting the MTLDevice setup and so on):

1. Initialize it: let gaussian = MPSImageGaussianBlur(device: device, sigma: 20.0)
2. Call its encode method: gaussian.encode(commandBuffer: commandBuffer,
                    sourceTexture: sourceTexture,
                    destinationTexture: destinationTexture)
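Putting the two steps together, a typical call site might look like the following sketch (assuming `device`, `commandBuffer`, `sourceTexture`, and `destinationTexture` are already set up as in the earlier listings):

```swift
import MetalPerformanceShaders

// Encode a Gaussian blur from sourceTexture into destinationTexture;
// sigma controls the blur radius in pixels.
let gaussian = MPSImageGaussianBlur(device: device, sigma: 20.0)
gaussian.encode(commandBuffer: commandBuffer,
                sourceTexture: sourceTexture,
                destinationTexture: destinationTexture)
commandBuffer.commit()
```

Note that an MPS kernel encodes its own compute pass internally; there is no pipeline state or threadgroup setup to manage. That is exactly the kind of interface we want to imitate.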

Following this analysis, we can mimic the MPS framework and define a protocol like this:

protocol ShaderProtocol {
    func encode(commandBuffer: MTLCommandBuffer, sourceTexture: MTLTexture, destinationTexture: MTLTexture)
}

For simplicity, I didn't define an initializer like init(device:). Next, define a SeparateRGB class that conforms to ShaderProtocol:

class SeparateRGB: ShaderProtocol {
    private var computePipelineState: MTLComputePipelineState?
    
    init() {
        computePipelineState = MetalManager.shared.makeComputePipelineState(functionName: "separateRGB")
    }
    
    func encode(commandBuffer: MTLCommandBuffer, sourceTexture: MTLTexture, destinationTexture: MTLTexture) {
        guard let cps = computePipelineState else {
            return
        }
        
        let computeCommandEncoder = commandBuffer.makeComputeCommandEncoder()
        computeCommandEncoder?.setComputePipelineState(cps)
        computeCommandEncoder?.setTexture(sourceTexture, index: 0)
        computeCommandEncoder?.setTexture(destinationTexture, index: 1)
        
        var diff = Float(CACurrentMediaTime() - MetalManager.shared.beginTime)
        computeCommandEncoder?.setBytes(&diff, length: MemoryLayout<Float>.size, index: 0)
        computeCommandEncoder?.dispatchThreadgroups(sourceTexture.threadGroups(pipeline: cps),
                                                    threadsPerThreadgroup: sourceTexture.threadGroupCount(pipeline: cps))
        computeCommandEncoder?.endEncoding()
    }
}
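The encode method above relies on two MTLTexture helpers, threadGroups(pipeline:) and threadGroupCount(pipeline:), which are not part of Metal itself. A minimal sketch of such an extension (the fixed 8×8 group size is an assumption; a real implementation might derive it from the pipeline's threadExecutionWidth):

```swift
extension MTLTexture {
    // Threadgroup dimensions; a fixed 8x8 size is assumed here for simplicity.
    func threadGroupCount(pipeline: MTLComputePipelineState) -> MTLSize {
        return MTLSize(width: 8, height: 8, depth: 1)
    }

    // Enough threadgroups to cover the whole texture, rounding up so that
    // edge pixels are not missed when the size is not a multiple of 8.
    func threadGroups(pipeline: MTLComputePipelineState) -> MTLSize {
        let count = threadGroupCount(pipeline: pipeline)
        return MTLSize(width: (width + count.width - 1) / count.width,
                       height: (height + count.height - 1) / count.height,
                       depth: 1)
    }
}
```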

Here MetalManager is a class that manages shared state (such as device/sourceTexture/destinationTexture) and converts a CVPixelBuffer into an MTLTexture:

class MetalManager: NSObject {
    static let shared: MetalManager = MetalManager()
    
    let device = MTLCreateSystemDefaultDevice()!
    var sourceTexture: MTLTexture?
    var destinationTexture: MTLTexture?
    var colorPixelFormat: MTLPixelFormat = .bgra8Unorm
    
    private(set) var beginTime = CACurrentMediaTime()
    var time: Float = 0
    
    private var library: MTLLibrary?
    private(set) var commandQueue: MTLCommandQueue?
    private var textureCache: CVMetalTextureCache?
    
    private override init() {
        super.init()
        
        library = device.makeDefaultLibrary()
        commandQueue = device.makeCommandQueue()
        CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache)
    }
    
    // MARK: - Public
    
    func makeComputePipelineState(functionName: String) -> MTLComputePipelineState? {
        guard let function = library?.makeFunction(name: functionName) else {
            return nil
        }
        
        return try? device.makeComputePipelineState(function: function)
    }
    
    func processNext(pixelBuffer: CVPixelBuffer) {
        guard let tc = textureCache else {
            return
        }
        
        var cvmTexture: CVMetalTexture?
        let width = CVPixelBufferGetWidth(pixelBuffer)
        let height = CVPixelBufferGetHeight(pixelBuffer)
        CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                  tc,
                                                  pixelBuffer,
                                                  nil,
                                                  colorPixelFormat,
                                                  width,
                                                  height,
                                                  0,
                                                  &cvmTexture)
        if let cvmTexture = cvmTexture, let texture = CVMetalTextureGetTexture(cvmTexture) {
            sourceTexture = texture
        }
    }
}

Now the Metal-related code can be removed from ViewController.swift and replaced by a SeparateRGB instance:

private let separateRGB = SeparateRGB()

Modify the captureOutput(_:didOutput:from:) implementation to convert the CVPixelBuffer into an MTLTexture:

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
        return
    }
     
    MetalManager.shared.processNext(pixelBuffer: pixelBuffer)
    lastSampleTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    
    DispatchQueue.main.sync {
        mtkView.draw()
    }
}

Next, modify the implementation of draw(in:):

func draw(in view: MTKView) {
    guard let currentDrawable = mtkView.currentDrawable,
          let texture = MetalManager.shared.sourceTexture,
          let commandBuffer = MetalManager.shared.commandQueue?.makeCommandBuffer() else {
        return
    }
    
    separateRGB.encode(commandBuffer: commandBuffer, sourceTexture: texture, destinationTexture: currentDrawable.texture)
    append(texture: currentDrawable.texture)
    
    commandBuffer.present(currentDrawable)
    commandBuffer.commit()
}

At this point we have successfully separated the Metal processing code from the ViewController: it only calls separateRGB's encode method and no longer needs to care about the internal implementation.

Similarly, the diffusion kernel can be wrapped in a Diffusion class:

class Diffusion: ShaderProtocol {
    private var computePipelineState: MTLComputePipelineState?
    private var lastTexture: MTLTexture?
    
    init() {
        computePipelineState = MetalManager.shared.makeComputePipelineState(functionName: "diffusion")
    }
    
    func encode(commandBuffer: MTLCommandBuffer, sourceTexture: MTLTexture, destinationTexture: MTLTexture) {
        guard let cps = computePipelineState else {
            return
        }
        
        if lastTexture == nil {
            lastTexture = sourceTexture.makeTextureView(pixelFormat: sourceTexture.pixelFormat)
        }
        
        let computeCommandEncoder = commandBuffer.makeComputeCommandEncoder()
        computeCommandEncoder?.setComputePipelineState(cps)
        computeCommandEncoder?.setTexture(sourceTexture, index: 0)
        computeCommandEncoder?.setTexture(destinationTexture, index: 1)
        computeCommandEncoder?.setTexture(lastTexture, index: 2)
        
        var diff = Float(CACurrentMediaTime() - MetalManager.shared.beginTime)
        computeCommandEncoder?.setBytes(&diff, length: MemoryLayout<Float>.size, index: 0)
        computeCommandEncoder?.dispatchThreadgroups(sourceTexture.threadGroups(pipeline: cps),
                                                    threadsPerThreadgroup: sourceTexture.threadGroupCount(pipeline: cps))
        computeCommandEncoder?.endEncoding()
        
        lastTexture = destinationTexture
    }
}

Open Main.storyboard, add a UISwitch to the ViewController's view and connect it, then add a Diffusion instance:

@IBOutlet weak var shaderSwitch: UISwitch!

private let diffusion = Diffusion()

Finally, modify the draw(in:) method:

......

let shader: ShaderProtocol
if shaderSwitch.isOn {
    shader = separateRGB
} else {
    shader = diffusion
}
        
shader.encode(commandBuffer: commandBuffer, sourceTexture: texture, destinationTexture: currentDrawable.texture)
        
......

Build and run; you can now toggle between the two effects with the switch :)


The full source code is available here :)

References