mruby-encoding

This mrbgem provides a lightweight, "poorman's" encoding functionality for mruby. It is designed to offer basic encoding support, primarily focused on UTF-8 and ASCII-8BIT.

Summary

License: MIT
Author: mruby developers
Supported Encodings:
- Encoding::UTF_8
- Encoding::ASCII_8BIT (aliased as Encoding::BINARY)

Functionality

This gem introduces an Encoding module and extends the String and Integer classes with encoding-related methods.

`Encoding` Module

A module (not a class, unlike standard Ruby) that holds encoding constants.

Encoding::UTF_8: Represents the UTF-8 encoding.
Encoding::ASCII_8BIT: Represents the ASCII-8BIT encoding.
Encoding::BINARY: An alias for Encoding::ASCII_8BIT.

`String` Methods

string.valid_encoding? -> true or false
- Returns true if the string is correctly encoded (particularly useful for UTF-8 strings). For ASCII-8BIT strings, it generally returns true.
string.encoding -> EncodingConstant
- Returns the encoding of the string. This will be Encoding::UTF_8 or Encoding::BINARY.
string.force_encoding(encoding_name) -> string
- Changes the string's reported encoding to the specified encoding_name (e.g., "UTF-8", "ASCII-8BIT", "BINARY").
- The actual byte sequence of the string is not changed.
- Raises an ArgumentError if an unsupported encoding name is provided.

`Integer` Method

integer.chr(encoding_name = Encoding::BINARY) -> String
- Returns a single-character string represented by the integer.
- If encoding_name is "UTF-8", the integer is treated as a Unicode codepoint.
- If encoding_name is "ASCII-8BIT" or "BINARY" (the default), the integer is treated as a byte value (0-255).
- Raises a RangeError if the integer is out of the valid range for the specified encoding.
- Raises an ArgumentError for unknown encoding names.

Usage Example

# main.rb
if __ENCODING__ == "UTF-8"
  s = "helloあ"
  puts s.encoding  #=> Encoding::UTF_8
  puts s.valid_encoding? #=> true

  s2 = "\xff".force_encoding("UTF-8")
  puts s2.valid_encoding? #=> false

  s3 = "world"
  s3.force_encoding("BINARY")
  puts s3.encoding #=> Encoding::BINARY
  puts s3.valid_encoding? #=> true (ASCII-8BIT strings are generally considered valid)

  puts 65.chr #=> "A" (defaults to ASCII-8BIT)
  puts 230.chr("UTF-8") #=> "æ" (if U+00E6 is æ)
  # For mruby, this might be different based on actual UTF-8 char mapping
  # For example, 12354.chr("UTF-8") might be "あ"
else
  s = "hello"
  puts s.encoding #=> Encoding::BINARY (or ASCII-8BIT)

  # Attempting to force to UTF-8 in a non-UTF-8 mruby build might be limited
  # or behave as ASCII-8BIT depending on mruby's core string handling.
end

# Force encoding
my_string = "\xE3\x81\x82" # UTF-8 bytes for "あ"
puts my_string.encoding # Might be BINARY by default if not created as UTF-8 literal

my_string.force_encoding("UTF-8")
puts my_string.encoding #=> Encoding::UTF_8
puts my_string #=> あ

invalid_utf8 = "\xff\xfe"
invalid_utf8.force_encoding("UTF-8")
puts invalid_utf8.valid_encoding? #=> false

# Integer#chr
puts 65.chr # => "A"
puts 65.chr("BINARY") # => "A"

# When mruby is compiled with MRB_UTF8_STRING
if Object.const_defined?(:MRB_UTF8_STRING)
  puts 12354.chr("UTF-8") # => "あ"
  # puts 0x110000.chr("UTF-8") #=> RangeError
end

Name		Name	Last commit message	Last commit date
parent directory ..
src		src
test		test
README.md		README.md
mrbgem.rake		mrbgem.rake

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

mruby-encoding

Summary

Functionality

`Encoding` Module

`String` Methods

`Integer` Method

Usage Example

FilesExpand file tree

mruby-encoding

Directory actions

More options

Directory actions

More options

Latest commit

History

mruby-encoding

Folders and files

parent directory

README.md

mruby-encoding

Summary

Functionality

Encoding Module

String Methods

Integer Method

Usage Example

`Encoding` Module

`String` Methods

`Integer` Method