Most of the time, when you want to parse text, you're going to do three steps:
1) Turn characters into tokens
2) Turn tokens into an abstract syntax tree
3) Use the abstract syntax tree to do some kind of processing
The lexer is responsible for the first step, and its *only* job is to create a "token stream" from text. It is not responsible for understanding the semantics of your language; it is only interested in understanding its syntax.
For example, syntax is the rule that an identifier may only use letters, numbers and underscores - as long as it doesn't start with a number. The responsibility of the lexer is to understand this rule. In this case, the lexer would accept the sequence of characters "asd_123" but reject "12dsadsa" (assuming there isn't another rule for which this text is valid). When it sees the valid text, it may emit a token into the token stream, such as IDENTIFIER(asd_123).
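That identifier rule is exactly the kind of thing a lexer encodes as a regular expression. A quick Python sketch (not tied to any particular lexer generator):
Code:
import re

# Syntax rule for identifiers: letters, digits and underscores,
# but the first character must not be a digit.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

print(bool(IDENTIFIER.fullmatch("asd_123")))   # True: valid identifier
print(bool(IDENTIFIER.fullmatch("12dsadsa")))  # False: starts with a number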
Note that I said "identifier", which is the general term for things like variable names, function names, namespace names, etc. The parser is the thing that understands the context in which that identifier appears, so it can then further specify that token as being a certain thing's name.
(Sidenote: the token is just a unique name given to an element of the token stream, and the lexeme is the text that the token was matched from. I write the lexeme in parentheses next to the token. For example, NUMBER(123) is a NUMBER token with a lexeme of '123'. However, with some tokens, such as operators, I omit the lexeme since it's redundant: I would write SEMICOLON for the semicolon token, not SEMICOLON(;).)
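If it helps to see that pairing in code, one common representation (sketched in Python with hypothetical names) is a small record type:
Code:
from dataclasses import dataclass

@dataclass
class Token:
    kind: str    # the token name, e.g. "NUMBER" or "SEMICOLON"
    lexeme: str  # the text it was matched from, e.g. "123" or ";"

print(Token("NUMBER", "123"))   # the NUMBER(123) from above
print(Token("SEMICOLON", ";"))  # lexeme kept here, even though it's redundant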
When the lexer is done, it will output a token stream (assuming that there were no errors). It would take a string such as:
Code:
var test = 1 + 2;
And turn it into a token stream of:
Code:
KEYWORD_VAR IDENTIFIER(test) OPERATOR_ASSIGNMENT NUMBER(1) OPERATOR_ADDITION NUMBER(2) SEMICOLON
Notice that in this example, I'm ignoring whitespace in the lexer (as in, the lexer just reads whitespace and discards it).
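To make that concrete, here's a minimal sketch of such a lexer in Python. The token names match the stream above, but the implementation details (the RULES table, the tokenize function) are my own invention, not any particular tool's API:
Code:
import re

# Each rule pairs a token name with a regex; None means "read and discard".
RULES = [
    (None,                  r"\s+"),   # whitespace: consumed, never emitted
    ("KEYWORD_VAR",         r"var\b"),
    ("NUMBER",              r"[0-9]+"),
    ("IDENTIFIER",          r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR_ASSIGNMENT", r"="),
    ("OPERATOR_ADDITION",   r"\+"),
    ("SEMICOLON",           r";"),
]

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        for name, pattern in RULES:
            m = re.compile(pattern).match(text, pos)
            if m:
                if name:  # skip discarded rules like whitespace
                    tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
    return tokens

print(tokenize("var test = 1 + 2;"))
# [('KEYWORD_VAR', 'var'), ('IDENTIFIER', 'test'), ('OPERATOR_ASSIGNMENT', '='),
#  ('NUMBER', '1'), ('OPERATOR_ADDITION', '+'), ('NUMBER', '2'), ('SEMICOLON', ';')]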
What is important here is that the token stream is FLAT and has no structure or hierarchy. It is literally just an array of tokens.
Semantics are the meaning of your language, and understanding them is the job of the parser. The parser is responsible for creating structure from the "flat" token stream; almost always, a parser will emit an abstract syntax tree. Remember that the lexer has no concept of the structure or meaning of your language, whereas your parser does. Taking our example from earlier, the parser would extract meaning from our statement, such as understanding that this is an ASSIGNMENT node between a NEW_VARIABLE and an EXPRESSION, and that the EXPRESSION node is an ADDITION node between NUMBER(1) and NUMBER(2) nodes.
For example, the tree would look something like:
Code:
(assignment ((newVariable IDENTIFIER(test)) (expression (add NUMBER(1) NUMBER(2)))))
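To see the same idea in code, here's a sketch of such a parser in Python, reusing the tokenize function from the earlier sketch. It only knows how to parse this one statement shape, and it represents nodes as plain tuples - a real parser would handle arbitrary expressions and use richer node types:
Code:
def parse(tokens):
    toks = iter(tokens)

    def expect(kind):
        # Consume one token and demand that it is of the given kind.
        name, lexeme = next(toks, ("EOF", ""))
        if name != kind:
            raise SyntaxError(f"expected {kind}, got {name}")
        return lexeme

    expect("KEYWORD_VAR")
    target = ("newVariable", expect("IDENTIFIER"))
    expect("OPERATOR_ASSIGNMENT")
    left = ("NUMBER", expect("NUMBER"))
    expect("OPERATOR_ADDITION")
    right = ("NUMBER", expect("NUMBER"))
    expect("SEMICOLON")
    return ("assignment", target, ("expression", ("add", left, right)))

print(parse(tokenize("var test = 1 + 2;")))
# ('assignment', ('newVariable', 'test'),
#  ('expression', ('add', ('NUMBER', '1'), ('NUMBER', '2'))))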
Take a close look at the shift in language. With the lexer, we only spoke of simple TOKENS that described what things *are*, but with the parser we speak in terms of what things *mean*. You would never use a lexer to understand the concept of addition; likewise, you would never use a parser to understand the concept of what defines a number, a string literal, or a semicolon.
So we've introduced, in the parser, the concepts of assignment, newVariable, expression and add nodes. These aren't tokens; these are nodes in our abstract syntax tree. ANTLR sometimes makes this a little more complicated, since tokens can be used as nodes - but nodes cannot be used as tokens. Notice in this example how I am differentiating tokens from nodes: tokens are uppercase whereas nodes are not. However, they're both used in the AST.
The parser is also responsible for enforcing the semantic rules of our language. For example, imagine that every statement must end with a semicolon - or that the block "invalid(true) { alert("HI"); }" is invalid, since the IDENTIFIER(invalid) token is not a language keyword such as if, while or for.
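With the sketch above, that kind of enforcement falls naturally out of expect(): feed it a statement missing its semicolon and the parser raises an error instead of producing a tree:
Code:
try:
    parse(tokenize("var test = 1 + 2"))   # no trailing semicolon
except SyntaxError as error:
    print(error)   # expected SEMICOLON, got EOF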
You may have multiple parser passes that simplify the tree. Perhaps you do some optimization during processing, too, such as converting a constant expression like "1 + 1" into the equivalent node "2". Then, all the way at the end, you simply walk your tree and do whatever it is your language does: emit bytecode, populate a data structure, etc.
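Here's a sketch of such a pass over the tuple-shaped nodes from earlier. It's hypothetical, but it shows the idea of constant folding:
Code:
def fold(node):
    # Recursively fold (add NUMBER NUMBER) nodes into a single NUMBER node.
    if isinstance(node, tuple):
        node = tuple(fold(child) for child in node)
        if (node[0] == "add"
                and node[1][0] == "NUMBER" and node[2][0] == "NUMBER"):
            return ("NUMBER", str(int(node[1][1]) + int(node[2][1])))
    return node

tree = ("expression", ("add", ("NUMBER", "1"), ("NUMBER", "1")))
print(fold(tree))   # ('expression', ('NUMBER', '2'))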
To answer your question: if you need to figure out what something syntactically *is*, use a lexer rule. If you need to extract meaning from a sequence of already-lexed tokens, use a parser rule. Parsers are intelligent, lexers are dumb - and because of that, many things are possible with parsers that are impossible with lexers. Does that make sense?