Live Pic
Open, NormalPublic


Live Pic is an algorithm in the domain of image and video processing for manipulating facial expressions in images and videos. It targets social media as well as industrial environments such as the film industry. The purpose of this algorithm is to provide a new platform where people can play with their images and videos by swapping facial expressions, then share them on social media or save them. Its core functionality is to extract the expressions from a face in an image and apply those expressions to the face in a video, so that the person in the video shows expressions similar to those of the person in the image.
People can transfer their facial expressions from an image onto a video in which an actor, actress, or anyone else is talking. The user can then analyze how closely their expressions match those of the person talking in the video. Users can also apply the expressions of a deceased person to see how he or she would look while talking. The algorithm will also let the user slightly adjust the position of facial parts: for example, if someone's eyes are a little closed in a photo, the algorithm will help open them a bit.
In an industrial environment, the proposed algorithm will help new actors and actresses transfer their facial expressions onto a professional actor's face while practicing. This will help them judge the difference between their expressions and those of a professional, instead of watching the video or the person over and over while practicing, which is time-consuming and can be frustrating. The new actor can then improve his or her expressions by self-assessment against one video as many times as desired. This algorithm is a next wave in the image and video processing world, as no comparable technique exists except CGI, which is the method used in the film industry; most people are unfamiliar with CGI or do not know how to use it.

Related Objects

sebastian updated the task description. (Show Details)
sebastian raised the priority of this task from to Normal.
sebastian assigned this task to mehrikhan36.
sebastian added a subscriber: sebastian.

@mehrikhan36 is this your GSoC project proposal?

Sorry for delay, yes @sebastian this is my GSoC project proposal.

Thanks, very interesting!

Can you elaborate more on the technical background of this proposed feature and planned implementation?

RexOr added a subscriber: RexOr.Mar 24 2017, 11:06 PM

Technically, the proposed feature is not as simple as it sounds. If we take CGI (Computer Generated Imagery) as the point of comparison, CGI works with more than one image (possibly very many images of a person) from which the expressions are extracted and applied to someone's face. (The movie Fast & Furious 7 is an example of CGI. Link)
The proposed feature, we assume, will work with a single image. Since the input data is just one image, a constraint arises: the face in the input image and in the destination video must be upright and facing the camera. For better results, we would need more than one image, with the face in different positions. As for the development environment, it depends on what is actually needed: whether we target this at the level of an algorithm, an application, or a feature embedded in the camera. For an algorithm or an application, MATLAB or OpenCV will work (OpenCV is the better choice, as it takes less time to process results compared to MATLAB).

The implementation has the following key modules to consider:

  • Face Detection
  • Feature Point Extraction
  • Pattern Matching

The deeper details involve correlation with a desired filter such as Sobel, Prewitt, or Canny to extract the edges; this will give us some of the expressions of the face. Meanwhile, we can eliminate unwanted edges by averaging the image, which helps the system spend less time on processing (at run time).

I hope these details can convince you about what I have proposed :)

Any feedback on the suggested project will be appreciated :)

RexOr added a comment.Mar 26 2017, 9:00 PM

@merikhan36 - Do you have any FPGA related experience?

In T765#11343, @RexOr wrote:

@merikhan36 - Do you have any FPGA related experience?

I have used an FPGA a little, but I don't have real experience with FPGAs.

My very direct and subjective feedback:

Considering this has already been done, and the complexity is very high, I am not convinced this is a feasible project for just a 3-month period and only one person. It is also largely unrelated to the camera technology itself (the researchers in the mentioned link used off-the-shelf webcams). Integrating the entire process into the camera, on the other hand, would make it camera-related again, but that is unrealistic considering the available resources (dual ARM cores). It would require sophisticated acceleration in the camera's internal FPGA, which seems out of scope for this proposal.

So my suggestion would be to consider focusing on the groundwork first. That could mean implementing CPU-based pattern/feature recognition/tracking (maybe with OpenCV) inside the AXIOM Beta. This would require a downscaled video stream to be available to userspace, though (which is not the case yet), and the CPUs will probably still not be able to do image analysis in real time. @Bertl might be able to better estimate the performance to expect. But for tracking applications it might be enough to get results every couple of frames.