Summary result (50%)

Every day, we encounter a large number of images from various sources such as the internet, news
articles, document diagrams and advertisements. The main aim of this paper is to provide a comprehensive survey of deep learning for image
captioning. In traditional machine learning, hand crafted features such as Local Binary Patterns (LBP) [107],
Scale-Invariant Feature Transform (SIFT) [87], the Histogram of Oriented Gradients (HOG) [27],
and a combination of such features are widely used. Image indexing is important for Content-Based Image Retrieval (CBIR) and therefore,
it can be applied to many areas, including biomedicine, commerce, the military, education, digital
libraries, and web searching. For example,
Convolutional Neural Networks (CNN) [79] are widely used for feature learning, and a classifier
such as Softmax is used for classification. CNN is generally followed by Recurrent Neural Networks
(RNN) in order to generate captions. These survey papers mainly discussed template based, retrieval
based, and a very few deep learning-based novel image caption generating models. Social media platforms such as Facebook and Twitter can directly
generate descriptions from images. Generating well-formed sentences requires both syntactic and semantic understanding
of the language [143]. Since hand crafted features are task specific, extracting features from
a large and diverse set of data is not feasible. On the other hand, in deep machine learning based techniques, features are learned automatically
from training data and they can handle a large and diverse set of images and videos. Although the papers have presented a good literature survey of
image captioning, they could only cover a few papers on deep learning because the bulk of them was
published after the survey papers. To provide an abridged version of the literature, we present a survey mainly focusing
on the deep learning-based papers on image captioning.

Original text

Every day, we encounter a large number of images from various sources such as the internet, news
articles, document diagrams and advertisements. These sources contain images that viewers would
have to interpret themselves. Most images do not have a description, but humans can largely
understand them without detailed captions. However, a machine needs to interpret some form
of image captions if humans need automatic image captions from it.
Image captioning is important for many reasons. For example, captions can be used for automatic image indexing. Image indexing is important for Content-Based Image Retrieval (CBIR) and therefore,
it can be applied to many areas, including biomedicine, commerce, the military, education, digital
libraries, and web searching. Social media platforms such as Facebook and Twitter can directly
generate descriptions from images. The descriptions can include where we are (e.g., beach, cafe),
what we wear and, importantly, what we are doing there.
Image captioning is a popular research area of Artificial Intelligence (AI) that deals with image
understanding and a language description for that image. Image understanding requires detecting and
recognizing objects. It also requires understanding the scene type or location, object properties and their
interactions. Generating well-formed sentences requires both syntactic and semantic understanding
of the language [143].
Understanding an image largely depends on obtaining image features. The techniques used for
this purpose can be broadly divided into two categories: (1) traditional machine learning-based
techniques and (2) deep machine learning-based techniques.
In traditional machine learning, handcrafted features such as Local Binary Patterns (LBP) [107],
Scale-Invariant Feature Transform (SIFT) [87], the Histogram of Oriented Gradients (HOG) [27],
and combinations of such features are widely used. In these techniques, features are extracted
from input data. They are then passed to a classifier such as a Support Vector Machine (SVM) [17]
in order to classify an object. Since handcrafted features are task-specific, extracting features from
a large and diverse set of data is not feasible. Moreover, real-world data such as images and videos
are complex and have different semantic interpretations.
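
To make the idea of a handcrafted feature concrete, the following is a minimal sketch of a basic 3x3 Local Binary Pattern descriptor in plain Python. It is an illustration only, not the method of any cited paper: the function names `lbp_code` and `lbp_histogram`, the clockwise bit ordering, and the toy image are assumptions made for this example. Each interior pixel is compared with its eight neighbours, the comparison results are packed into an 8-bit code, and the histogram of codes serves as the feature vector that would then be passed to a classifier.

```python
from collections import Counter

def lbp_code(image, r, c):
    """Return the 8-bit LBP code for the interior pixel at (r, c)."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    # This ordering is an assumed convention; others are equally valid.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    rows, cols = len(image), len(image[0])
    counts = Counter(
        lbp_code(image, r, c)
        for r in range(1, rows - 1)
        for c in range(1, cols - 1)
    )
    return [counts.get(code, 0) for code in range(256)]

# Hypothetical 3x3 grayscale patch, used only to exercise the functions.
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
features = lbp_histogram(image)
```

The comparison rule is fixed by hand rather than learned from data, which is why such features tend to be tied to the task they were designed for.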
On the other hand, in deep machine learning based techniques, features are learned automatically
from training data and they can handle a large and diverse set of images and videos. For example,
Convolutional Neural Networks (CNN) [79] are widely used for feature learning, and a classifier
such as Softmax is used for classification. A CNN is generally followed by a Recurrent Neural Network
(RNN) in order to generate captions.
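
As a small illustration of the classification step mentioned above, the sketch below implements a softmax over a list of class scores in plain Python. The scores are made-up values standing in for the output of a feature-learning network, and the function name `softmax` is simply chosen for this example; it is not taken from any particular library.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three classes.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)

# The predicted class is the index with the highest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In a captioning model, the same operation is typically applied at each decoding step to turn the decoder's scores over the vocabulary into a distribution from which the next word is chosen.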
In the last 5 years, a large number of articles have been published on image captioning with deep
machine learning being popularly used. Deep learning algorithms can handle complexities and
challenges of image captioning quite well. So far, only three survey papers [8, 13, 75] have been
published on this research topic. Although these papers present a good literature survey of
image captioning, they could cover only a few papers on deep learning because the bulk of them were
published after the survey papers. These survey papers mainly discussed template-based, retrieval-based,
and very few deep learning-based novel image caption generating models. However, a
large body of work now exists on deep learning-based image captioning. Moreover, the
availability of large and new datasets has made learning-based image captioning an interesting
research area. To provide an abridged version of the literature, we present a survey mainly focusing
on the deep learning-based papers on image captioning.
The main aim of this paper is to provide a comprehensive survey of deep learning for image
captioning. First,
