Text processing - Wikipedia
From Wikipedia, the free encyclopedia
Creating or manipulating electronic text
This article is about computer processing of text. For mental processing, see Reading comprehension. For language processing by computers, see Natural language processing.

In computing, the term text processing refers to the theory and practice of automating the creation or manipulation of electronic text. Text usually refers to all the alphanumeric characters available on the keyboard of the person engaging in the practice, but in general text means the abstraction layer immediately above the standard character encoding of the target text. The term processing refers to automated (or mechanized) processing, as opposed to the same manipulation done manually.

Text processing involves computer commands that invoke content, change content, and move the cursor, for example to

  • search and replace text,
  • format text,
  • generate a processed report of the content of a file, or
  • filter a file or a report of a text file.
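
As a rough illustration of three of the operations listed above (search and replace, filtering, and reporting), here is a minimal Python sketch. The function names and the sample text are assumptions made for this example; they do not come from any particular text processing utility.

```python
import re


def search_and_replace(text: str, pattern: str, replacement: str) -> str:
    """Replace every regex match of `pattern` in `text` with `replacement`."""
    return re.sub(pattern, replacement, text)


def filter_lines(text: str, pattern: str) -> str:
    """Keep only the lines that contain a match for `pattern` (grep-like)."""
    kept = [line for line in text.splitlines() if re.search(pattern, line)]
    return "\n".join(kept)


def report(text: str) -> str:
    """Summarize the text as line, word, and character counts."""
    lines = text.splitlines()
    words = text.split()
    return f"lines={len(lines)} words={len(words)} chars={len(text)}"


# Hypothetical sample input used only for demonstration.
sample = "alpha beta\ngamma delta\nalpha omega"

print(search_and_replace(sample, r"alpha", "ALPHA"))
print(filter_lines(sample, r"^alpha"))
print(report(sample))
```

Each function takes text in and returns text out, so the result of one step can be fed to the next.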

The text processing of a regular expression is a virtual editing machine with a primitive programming language that has named registers (identifiers) and named positions in the sequence of characters comprising the text. Using these, the "text processor" can, for example, mark a region of text and then move it. The text processing of a utility is a filter program, or filter. Together, these two mechanisms comprise text processing.
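
One way to picture named registers and named positions is through named groups in Python's `re` module. The sketch below is only an analogy under that assumption: the group names `last` and `first` and the sample string are hypothetical, and the article does not tie the concept to any specific regex dialect.

```python
import re

# Two named groups act as "registers" holding the marked regions.
NAME = re.compile(r"(?P<last>\w+), (?P<first>\w+)")


def marked_positions(text: str) -> dict:
    """Return the (start, end) position of each named region, or {} if no match."""
    match = NAME.search(text)
    if match is None:
        return {}
    return {name: match.span(name) for name in ("last", "first")}


def move_regions(text: str) -> str:
    """Move the marked regions: emit `first` before `last`."""
    return NAME.sub(r"\g<first> \g<last>", text)


print(marked_positions("Kleene, Stephen"))
print(move_regions("Kleene, Stephen"))
```

The pattern marks two regions and records where they sit in the character sequence; the substitution then rearranges them without the caller handling indices directly.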

Definition

Since standardized markup such as ANSI escape codes is generally invisible to the editor, it comprises a set of transitory properties that at times become indistinguishable from word processing. But the definite distinctions from word processing are that text processing proper:

  • represents "text processing utilities", not just "text editing" applications.
  • is much more "the keyboard way", as opposed to "the mouse way" (e.g. drag and drop, cut and paste) of initiating an edit.
  • is sequential access rather than random access in approach.
  • operates directly at the presentation layer rather than indirectly at the application layer.
  • works on raw data that is standardized, and works more openly rather than tending towards proprietary methods.

In this way, markup such as font and color is not really a distinguishing factor, because the character sequences that affect font and color are simply standard characters inserted automatically by a background text processing mode. Compliant text editors make them work transparently, yet they become visible as text processing commands when that mode is not in effect. So text processing is defined most basically (but not entirely) around the visual characters (or graphemes) rather than the standard, yet invisible, characters.
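
To make the point about color markup concrete, the following sketch wraps a word in ANSI SGR escape sequences and then strips them again. The helper names are assumptions for illustration, and the pattern covers only the common `ESC [ ... m` color and style sequences, not every ANSI escape code.

```python
import re

RED = "\x1b[31m"    # SGR sequence: set foreground color to red
RESET = "\x1b[0m"   # SGR sequence: reset all attributes

# Matches only SGR sequences of the form ESC [ digits-and-semicolons m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def colorize(word: str) -> str:
    """Wrap `word` in escape sequences that a compliant terminal renders as red."""
    return f"{RED}{word}{RESET}"


def strip_ansi(text: str) -> str:
    """Remove SGR escape sequences, leaving only the visible characters."""
    return ANSI_SGR.sub("", text)


styled = colorize("error")
print(repr(styled))        # repr() makes the otherwise invisible characters visible
print(strip_ansi(styled))
```

Printing `styled` to a compliant terminal shows a red word, while `repr()` exposes the same markup as ordinary characters in the stream.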

History

The development of computer text processing started in earnest with Kleene's formalization of regular languages. Once that language was extended, a regular expression could become a mini-program, complete with a compilation process, available to perform any edit. Similarly, filters are extended by evolving particular options.

Basic concepts

An editor essentially invokes an input stream and directs it to the text processing environment, which is either a command shell or a text editor. The resulting output can be fed to further text processing, and the final result is comparable to a single application of an algorithm by a more sophisticated and structured computer program.
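
The idea that each output can feed further text processing can be sketched as a chain of small filters, much like a shell pipeline. The filter names and the `pipeline` helper below are hypothetical, chosen only for this illustration.

```python
from typing import Callable, Iterable, Iterator, List

# A filter consumes a stream of lines and yields a new stream of lines.
Filter = Callable[[Iterable[str]], Iterator[str]]


def strip_blank(lines: Iterable[str]) -> Iterator[str]:
    """Drop lines that are empty or contain only whitespace."""
    for line in lines:
        if line.strip():
            yield line


def lowercase(lines: Iterable[str]) -> Iterator[str]:
    """Convert every line to lower case."""
    for line in lines:
        yield line.lower()


def unique_sorted(lines: Iterable[str]) -> Iterator[str]:
    """Emit each distinct line once, in sorted order."""
    yield from sorted(set(lines))


def pipeline(lines: Iterable[str], *filters: Filter) -> List[str]:
    """Feed the output of each filter into the next one, in order."""
    stream: Iterable[str] = lines
    for apply_filter in filters:
        stream = apply_filter(stream)
    return list(stream)


result = pipeline(["Beta", "", "alpha", "beta"], strip_blank, lowercase, unique_sorted)
print(result)
```

No single filter does much on its own; the combined effect comes from composing them in sequence.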

Text processing is, unlike an algorithm, a manually administered sequence of simpler macros: the pattern-action expressions and filtering mechanisms. In either case the programmer's intention is impressed indirectly upon a given set of textual characters in the act of text processing. The results of a text processing step are sometimes only hopeful, and the attempted mechanism is often subject to multiple drafts, guided by visual feedback, until the regular expression or markup language details, or the utility options, are fully mastered.
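
A pattern-action expression pairs a pattern with something to do when it matches. The sketch below imitates that style in Python; the rule format, the `run_rules` helper, and the log-like sample lines are all assumptions for illustration and are not modeled on the exact semantics of any specific tool.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Each rule pairs a compiled pattern with an action applied to the match.
Rule = Tuple[re.Pattern, Callable[[re.Match], str]]


def run_rules(lines: Iterable[str], rules: List[Rule]) -> List[str]:
    """Apply the first matching rule to each line; unmatched lines are dropped."""
    output = []
    for line in lines:
        for pattern, action in rules:
            match = pattern.search(line)
            if match:
                output.append(action(match))
                break
    return output


# Hypothetical rules: rewrite ERROR and WARN lines, ignore everything else.
rules: List[Rule] = [
    (re.compile(r"^ERROR (?P<msg>.*)$"), lambda m: "error: " + m.group("msg")),
    (re.compile(r"^WARN (?P<msg>.*)$"), lambda m: "warning: " + m.group("msg")),
]

print(run_rules(["ERROR disk full", "INFO ok", "WARN low memory"], rules))
```

In practice the rule list is what gets redrafted: one adjusts a pattern, reruns it, inspects the output, and repeats.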

Text processing is concerned mostly with producing textual characters at the highest level of computing, where its activities are just below the practical uses of computing, namely the manual transmission of information.

Ultimately all computing is text processing, from the self-compiling textual characters of an assembler, through the automated programming language generated to handle a blob of graphical data, and finally to the metacharacters of regular expressions which groom existing text documents.

Text processing is its own automation.

Characters

Textual characters come in standardized character sets that also contain control characters, such as newline characters, which arrange text. Other types of control characters arrange the transmission, define the character sets, and perform other housekeeping tasks.
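
As a small illustration, the sketch below locates control characters in a string using the Unicode general category `Cc`, as reported by Python's standard `unicodedata` module. The function name and sample string are assumptions for this example.

```python
import unicodedata
from typing import List, Tuple


def find_controls(text: str) -> List[Tuple[int, str]]:
    """Return (index, code point) for each control character (category Cc)."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(text)
        if unicodedata.category(char) == "Cc"
    ]


# Hypothetical sample: a tab, a line feed, then a carriage return + line feed pair.
sample_text = "one\ttwo\nthree\r\n"
print(find_controls(sample_text))
```

Here the tab and the line endings arrange the text rather than appear in it, which is why they are reported by position and code point instead of being printed.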

See also

  • Text editor
  • List of Unix commands

External links

  • The subject matter of the book Automatic Text Processing by Gerard Salton
  • Database with Text Processing Tools (2013-10-23), archived 2021-03-05 at the Wayback Machine
  • Content analysis software: software for content analysis
  • Text Tools Online: online text processing tools
Retrieved from "https://teknopedia.ac.id/w/index.php?title=Text_processing&oldid=1308781111"