CLC number: TP391.41
Received: 2006-06-01
Revision Accepted: 2006-10-10
JIANG Ren-jie, QI Fei-hu, XU Li, WU Guo-rong, ZHU Kai-hua. A learning-based method to detect and segment text from scene images[J]. Journal of Zhejiang University Science A, 2007, 8(4): 568-574.
@article{jiang2007learning,
title="A learning-based method to detect and segment text from scene images",
author="Jiang, Ren-jie and Qi, Fei-hu and Xu, Li and Wu, Guo-rong and Zhu, Kai-hua",
journal="Journal of Zhejiang University Science A",
volume="8",
number="4",
pages="568--574",
year="2007",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/jzus.2007.A0568"
}
%0 Journal Article
%T A learning-based method to detect and segment text from scene images
%A JIANG Ren-jie
%A QI Fei-hu
%A XU Li
%A WU Guo-rong
%A ZHU Kai-hua
%J Journal of Zhejiang University Science A
%V 8
%N 4
%P 568-574
%@ 1673-565X
%D 2007
%I Zhejiang University Press & Springer
%R 10.1631/jzus.2007.A0568
TY  - JOUR
T1  - A learning-based method to detect and segment text from scene images
A1  - JIANG Ren-jie
A1  - QI Fei-hu
A1  - XU Li
A1  - WU Guo-rong
A1  - ZHU Kai-hua
JO  - Journal of Zhejiang University Science A
VL  - 8
IS  - 4
SP  - 568
EP  - 574
SN  - 1673-565X
Y1  - 2007
PB  - Zhejiang University Press & Springer
DO  - 10.1631/jzus.2007.A0568
ER  - 
Abstract: This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected components (CCs) by the Niblack clustering algorithm. All the CCs, text and non-text alike, are then verified on their text features by a two-stage classification module: an attentional cascade classifier discards most non-text CCs, and an SVM further verifies the remaining ones. The accepted CCs are output as a text-only binary image. Experiments on many images from different scenes showed satisfactory performance of the proposed method.
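The pipeline in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard Niblack local threshold (T = local mean + k × local standard deviation, with dark pixels taken as foreground) as a stand-in for the decomposition step, 4-connected labeling for the CCs, a list of cheap reject-early tests in place of the trained attentional cascade, and a hand-set linear decision function in place of the trained SVM. All function names, features, thresholds, and weights (`niblack_binarize`, `CASCADE`, `stage_two_score`, and so on) are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def niblack_binarize(img, radius=1, k=-0.2):
    """Flag a pixel as foreground when it is darker than the local
    Niblack threshold T = mean + k * std over a square window."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            threshold = mean(window) + k * pstdev(window)
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out


def connected_components(mask):
    """Group foreground pixels into 4-connected components (lists of (y, x))."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                stack, pixels = [(y, x)], []
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                components.append(pixels)
    return components


def features(pixels):
    """Hypothetical shape features of one CC (not the paper's feature set)."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"area": len(pixels),
            "fill": len(pixels) / (width * height),
            "elongation": max(width, height) / min(width, height)}


# Stage one: cheap tests applied in order; a CC is dropped at the first failure.
CASCADE = [lambda f: f["area"] >= 3,
           lambda f: f["elongation"] <= 10]


def passes_cascade(f):
    return all(stage(f) for stage in CASCADE)  # all() stops at the first False


def stage_two_score(f, weights=(0.5, 1.0), bias=-2.0):
    """Linear decision function w.x + b with hand-set, untrained weights,
    standing in for a trained SVM."""
    return weights[0] * f["elongation"] + weights[1] * f["fill"] + bias


def extract_text_mask(img):
    """Return a binary image containing only the CCs accepted by both stages."""
    out = [[0] * len(img[0]) for _ in img]
    for pixels in connected_components(niblack_binarize(img)):
        f = features(pixels)
        if passes_cascade(f) and stage_two_score(f) >= 0:
            for y, x in pixels:
                out[y][x] = 1
    return out
```

Calling `extract_text_mask` on a grayscale image given as a list of rows returns a mask of the same size. The ordering mirrors the abstract: the inexpensive first stage removes most candidates, so the costlier second-stage decision runs only on the survivors.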