C++ notes
To keep gdb from stopping on SIGPIPE while debugging a client/server, use (gdb) handle SIGPIPE nostop
GENERAL
- functions defined inside a class declaration (in a header file) are implicitly inline, i.e. the compiler may substitute their body wherever a call appears
- static const integral class members can be declared without a separate definition (the value is set right where the member is declared); the compiler then substitutes every occurrence of the member with the actual value and needs no storage for it, as long as its address is never taken
- a nested class is a member of the outer class, so it can access the outer class's private members as a friend would (the outer class gets no special access to the nested class)
- a custom operator new (with extra args) must always be accompanied by an operator delete with the same list of extra args. Otherwise the C++ runtime cannot undo the new if the ctr throws an exception
- applying delete never calls “placement delete”
- zero length arrays are not legal in C++
- ellipsis function: a function that accepts any arguments:
foo(...)
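- local classes are effectively final: code outside the function cannot name them, so nothing can derive from them
- the compiler does not compile unused member functions of a class template; it only checks their syntax
- an if (a conditional branch) can stall the CPU pipeline when it is mispredicted, i.e. it may affect performance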
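The notes on custom new and "placement delete" can be sketched as below. This is a minimal illustration, not a recommended allocator: Tracker, Widget and both helper functions are hypothetical names introduced here, and malloc/free stand in for whatever the custom new would really do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical extra argument for operator new; it only counts calls.
struct Tracker {
    int allocations = 0;
    int deallocations = 0;
};

struct Widget {
    explicit Widget(bool fail) {
        if (fail) throw std::runtime_error("ctor failed");
    }

    // Custom operator new with an extra argument.
    static void* operator new(std::size_t size, Tracker& t) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        ++t.allocations;
        return p;
    }

    // Matching "placement delete": called by the runtime only if the ctor throws.
    static void operator delete(void* p, Tracker& t) noexcept {
        ++t.deallocations;
        std::free(p);
    }

    // Ordinary delete: this is what a plain delete expression calls.
    static void operator delete(void* p) noexcept { std::free(p); }
};

// The ctor throws, so the runtime undoes the allocation via the matching delete.
inline bool memory_released_when_ctor_throws() {
    Tracker t;
    try {
        Widget* w = new (t) Widget(true);  // throws inside the ctor
        delete w;                          // never reached
    } catch (const std::runtime_error&) {
    }
    return t.allocations == 1 && t.deallocations == 1;
}

// The ctor succeeds; delete w goes to operator delete(void*), not to the placement form.
inline bool plain_delete_skips_placement_delete() {
    Tracker t;
    Widget* w = new (t) Widget(false);
    delete w;
    return t.allocations == 1 && t.deallocations == 0;
}
```

Both helpers are expected to return true; removing the two-argument operator delete would leave the memory unreleased in the throwing case.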
C++ Core Guidelines
- for cheap-to-copy objects (primitives, string_view, iterators, small objects of ~16 B) prefer pass by value
- prefer return values to output parameters
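A small sketch of both guidelines, assuming C++17 for std::string_view. The function names count_spaces and split_first_word are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Cheap-to-copy argument: std::string_view is taken by value.
inline std::size_t count_spaces(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

// Results are returned, not written into output parameters.
// first: text before the first space; second: the rest, without that space.
inline std::pair<std::string, std::string> split_first_word(std::string_view text) {
    const std::size_t pos = text.find(' ');
    if (pos == std::string_view::npos) {
        return {std::string(text), std::string()};
    }
    return {std::string(text.substr(0, pos)), std::string(text.substr(pos + 1))};
}
```

Usage: split_first_word("hello big world") yields the pair ("hello", "big world"), and the caller never has to pre-declare variables to be filled in.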
TEMPLATE META PROGRAMMING
- the expression being evaluated is a compile-time constant, so the check can be done by the compiler instead of by runtime code. The idea is to pass the compiler a language construct that is legal for a non-zero expression (e.g. sizeof(A) <= sizeof(B)) and illegal for an expression that evaluates to zero
- assume
template<bool> struct CompileTimeError; template<> struct CompileTimeError<true>{};
If you try to instantiate CompileTimeError<false>, the compiler reports a message such as “Undefined specialization”
- if the compiler sees a specialized template, it uses that implementation wherever the specialization applies; otherwise it uses the generic implementation
- partial specialization is not applicable to function templates, member or non-member (in any C++ standard so far). For non-member functions in a namespace, overloading gives a similar effect, but only on the function args (not on the RV)
- different specializations of a template are different types in C++. Hence Type<true> and Type<false> are different types. This makes compile-time dispatching possible
- virtual functions cannot be templates
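The compile-time check and the Type<true>/Type<false> dispatch can be combined in one sketch. STATIC_CHECK, process and process_impl are hypothetical names; in C++11 and later, static_assert does the job of the macro.

```cpp
#include <cassert>

// Only the <true> specialization is defined; CompileTimeError<false> stays incomplete.
template <bool> struct CompileTimeError;
template <> struct CompileTimeError<true> {};

// Hypothetical helper: compiles only when expr is a non-zero compile-time constant,
// because sizeof needs a complete type.
#define STATIC_CHECK(expr) ((void)sizeof(CompileTimeError<(expr) != 0>))

// Each value of the bool parameter produces a distinct type.
template <bool B> struct Type {};

// Overloads selected at compile time by the type of the second argument.
inline int process_impl(int x, Type<true>)  { return x * 2; }
inline int process_impl(int x, Type<false>) { return x + 1; }

template <bool UseFirst>
int process(int x) {
    STATIC_CHECK(sizeof(char) <= sizeof(int));     // legal: the condition holds
    // STATIC_CHECK(sizeof(int) <= sizeof(char));  // would not compile
    return process_impl(x, Type<UseFirst>());
}
```

process<true>(10) takes the first overload and process<false>(10) the second; the choice costs nothing at runtime.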
TEMPLATE TYPE DEDUCTION
- Suppose the signature:
template<typename T> void foo(T&);
It is safe to pass a const argument, as the deduced parameter type will be const ParamType&, i.e. if
const int x = 123;
then foo(x) instantiates foo(const int&)
- Universal reference (forwarding reference), i.e. foo(T&&), can be deduced to be either an lvalue reference or an rvalue reference, depending on whether the argument passed is an lvalue or an rvalue
- Suppose foo(T), where T is a ParamType; the actual argument will always be passed by value, i.e. copied
- to constrain templates, i.e. to hide them from the compiler when certain conditions apply, one can use (in template expressions)
std::enable_if; std::is_same; std::is_base_of; std::is_integral; std::is_constructible
etc. See Effective Modern C++ Item 27
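The three deduction cases and an enable_if constraint, as a sketch. All function names are invented for the example, and the ellipsis overload of describe is only there to show what happens when the constrained template drops out.

```cpp
#include <cassert>
#include <type_traits>

// T& parameter: the constness of the argument becomes part of T.
template <typename T>
bool deduced_const(T&) { return std::is_const<T>::value; }

// T&& (forwarding reference): T is an lvalue reference for lvalue arguments.
template <typename T>
bool got_lvalue(T&&) { return std::is_lvalue_reference<T>::value; }

// T by value: top-level const is dropped and the argument is copied.
template <typename T>
bool by_value_is_const(T) { return std::is_const<T>::value; }

// Constrained template: visible to overload resolution only for integral T.
template <typename T,
          typename = typename std::enable_if<std::is_integral<T>::value>::type>
int describe(T) { return 1; }

// Fallback picked when the template above is disabled.
inline int describe(...) { return 0; }
```

With const int cx = 123, deduced_const(cx) is true and by_value_is_const(cx) is false; describe(42) returns 1 and describe(3.14) falls back to 0.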
OBJECT INITIALIZATION
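- initialization with curly braces, e.g. Foo f{1, 2}, prefers a ctr taking std::initializer_list if one exists (empty braces, Foo f{}, call the default ctr)
- a {} initializer cannot be perfect-forwarded: forwarding code such as new Foo(std::forward<Ts>(params)...) uses parentheses, and a braced list cannot be passed through it directly
- auto foo = {10, 20} creates a std::initializer_list<int>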
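A sketch of which ctr each form of initialization selects. Foo here is a hypothetical class whose only job is to record the chosen ctr.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// Hypothetical class that remembers which ctor ran.
struct Foo {
    int chosen;
    Foo() : chosen(0) {}
    Foo(int, int) : chosen(1) {}
    Foo(std::initializer_list<int>) : chosen(2) {}
};

inline int pick_default() { Foo f{};     return f.chosen; }  // default ctor
inline int pick_parens()  { Foo f(1, 2); return f.chosen; }  // Foo(int, int)
inline int pick_braces()  { Foo f{1, 2}; return f.chosen; }  // initializer_list ctor

// auto with a braced list deduces std::initializer_list<int>.
inline bool auto_braces_is_initializer_list() {
    auto foo = {10, 20};
    return std::is_same<decltype(foo), std::initializer_list<int>>::value;
}
```

Note that Foo f{1, 2} and Foo f(1, 2) pick different ctors even though Foo(int, int) is an exact match for both.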
LVALUE&RVALUE
- lvalue: persistent value; rvalue: temporary value, e.g. in foo(new Bar) the result of new is an rvalue
- when passed by value, lvalues are copied (slow); rvalues are moved (fast)
- std::move casts an lvalue to an rvalue; it does not move anything by itself
- to overload a method that takes a universal reference as an argument, one can use the tag dispatching technique. See Effective Modern C++ Item 27
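Copy versus move when passing by value can be made visible with counters. Bar, consume and reset_counters are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how its instances were constructed.
struct Bar {
    static int copies;
    static int moves;
    Bar() = default;
    Bar(const Bar&) { ++copies; }
    Bar(Bar&&) noexcept { ++moves; }
};
int Bar::copies = 0;
int Bar::moves = 0;

// By-value parameter: the caller's argument is either copied or moved into it.
inline void consume(Bar) {}

inline void reset_counters() {
    Bar::copies = 0;
    Bar::moves = 0;
}
```

Given Bar b, consume(b) increments Bar::copies, while consume(std::move(b)) increments Bar::moves; std::move itself only changes how the argument is treated.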
TAG DISPATCHING
A keystone of tag dispatch is the existence of a single (unoverloaded) function as the client API. This single function dispatches the work to be done to the implementation functions.
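A minimal sketch of the technique, loosely in the spirit of the example in Effective Modern C++ Item 27. The names log_and_add and log_impl and the returned strings are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Implementation for integral arguments (tag: std::true_type).
template <typename T>
std::string log_impl(T&& value, std::true_type) {
    return "index:" + std::to_string(value);
}

// Implementation for everything else (tag: std::false_type).
template <typename T>
std::string log_impl(T&& value, std::false_type) {
    return "name:" + std::string(std::forward<T>(value));
}

// The single, unoverloaded client API: it only computes the tag and dispatches.
template <typename T>
std::string log_and_add(T&& value) {
    return log_impl(
        std::forward<T>(value),
        std::is_integral<typename std::remove_reference<T>::type>());
}
```

The universal reference stays in one place, and the choice between implementations is made by the tag type, not by overloading the public function.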
SMART POINTERS
- a shared pointer is twice the size of a raw pointer (pointer to the Widget + pointer to the control block). The control block may be quite big if a custom deleter is used
- make_shared performs single memory allocation (Widget + control block)
- using new instead of calling make_shared / make_unique may lead to a resource leak (before C++17): processWidget(shared(new Bar), doSmth()) -> new Bar; doSmth [throws exception]; shared ptr ctr [is not executed]
- do not pass shared_ptr by reference when the function does not take part in ownership: prefer a pointer to the object or a const ref to the object
- pass shared_ptr by value if the called function takes a share of ownership
- the control block contains a weak count (in addition to the ref count and other stuff) and thus cannot be deleted while the weak count > 0
- when the pimpl idiom is applied and unique_ptr is used, the destructor, copy and move operations must be declared in the header and defined in the source file (a simple Widget::~Widget() = default; will do)
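The ownership notes can be observed through use_count and weak_ptr. Widget and the three functions below are hypothetical, and the sketch only reports counts; it does not measure allocations.

```cpp
#include <cassert>
#include <memory>

struct Widget {
    int value = 7;
};

// shared_ptr by value: the callee holds a share of ownership while it runs.
inline long owners_inside(std::shared_ptr<Widget> w) { return w.use_count(); }

// No ownership involved: a plain const reference to the object is enough.
inline int read_value(const Widget& w) { return w.value; }

// A weak_ptr keeps the control block alive, but not the Widget.
inline bool weak_outlives_object() {
    std::weak_ptr<Widget> weak;
    {
        auto sp = std::make_shared<Widget>();  // Widget and control block together
        weak = sp;
    }  // last shared owner is gone: the Widget is destroyed here
    return weak.expired() && weak.lock() == nullptr;
}
```

For a shared_ptr sp with one owner, owners_inside(sp) reports 2 during the call and the count drops back to 1 afterwards.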
MOVING AND COPYING
- use std::move on rvalue references; std::forward on universal references
- std::move used in a function call, foo(std::move(smth)), forces the compiler to treat a local variable as a temporary unnamed argument to the function
- RVO: return value optimization, i.e. if 1) a function creates a local object and 2) returns this object (it has the same type as the return type), this object may be constructed directly in the return value's storage, i.e. no copying occurs on the return statement
- perfect forwarding fails in the following cases: type deduction fails or is impossible; {} initialization; 0 or NULL used as a null pointer; forwarding overloaded functions; forwarding bitfields
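What std::forward preserves, and what is lost without it, in a short sketch. category, forwarded, not_forwarded and make_greeting are names invented here.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Overloads that report the value category they received.
inline std::string category(const std::string&) { return "lvalue"; }
inline std::string category(std::string&&)      { return "rvalue"; }

// Perfect forwarding: std::forward keeps the caller's value category.
template <typename T>
std::string forwarded(T&& arg) { return category(std::forward<T>(arg)); }

// Without std::forward the named parameter is always an lvalue.
template <typename T>
std::string not_forwarded(T&& arg) { return category(arg); }

// RVO candidate: a local object of the return type, returned by name.
inline std::string make_greeting() {
    std::string s = "hello";
    return s;  // no std::move here: it would get in the way of copy elision
}
```

Passing a temporary std::string to forwarded reaches the rvalue overload, while not_forwarded sends the same temporary to the lvalue overload.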
CONCURRENCY
- the std::async default launch policy guarantees neither that the task will be executed concurrently nor that it will be executed at all (if deferred, it is executed only when get or wait is called on the resulting future)
- volatile in C++ means special memory (memory-mapped IO, for instance) and disallows optimizations on this memory, e.g. reordering, deleting dead stores etc
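The "may never run" case can be reproduced by forcing the deferred policy, which the default policy is allowed to pick. run_deferred is a hypothetical helper; passing std::launch::async would instead guarantee asynchronous execution.

```cpp
#include <cassert>
#include <future>

// Returns how many times the task body ran (0 or 1).
inline int run_deferred(bool call_get) {
    int executed = 0;
    // std::launch::deferred: the task runs lazily, on the thread that calls get()/wait().
    std::future<int> f = std::async(std::launch::deferred, [&executed] {
        ++executed;
        return 42;
    });
    if (call_get) {
        int result = f.get();  // the task body runs here
        (void)result;
    }
    return executed;  // stays 0 when neither get() nor wait() was called
}
```

run_deferred(true) reports one execution and run_deferred(false) reports none, since the future is destroyed without the task ever being started.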
CUDA
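- device function limitations (early CUDA versions; later compute capabilities lift some of these, e.g. recursion):
  - their address cannot be taken
  - no recursion
  - no static variables
  - no variable number of arguments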
- variable specification limits:
  - no extern
  - __constant__ variables can be written only from the CPU, through special functions (e.g. cudaMemcpyToSymbol)
  - __shared__ variables may not be initialized in their declaration
- data types on GPU: 1/2/3/4-dimensional vectors of (u)char, (u)int, (u)short, (u)long, long long, float and double
- starting from GPU 8 (nvidia), the GPU natively supports bit operations
- max length 128 bit
- double[][] – OK; double[][][] – NOK
- dim3: a uint3 with a ctr (components that are not initialized default to 1), i.e. dim3(5) == [5,1,1]
- factory for types: make_{type}, e.g. make_int2(1,7)
- kernel has access to:
dim3 gridDim; uint3 blockIdx; dim3 blockDim; uint3 threadIdx; int warpSize;
- CUDA by default initializes every unspecified grid/block dimension to 1
- how to run a kernel with nx threads in total:
float* data;                        // array ptr
dim3 threads(256);                  // threads per block
dim3 blocks(nx / 256);              // how many blocks (assumes nx is a multiple of 256)
// e.g. nx = 2560 defines a set of 10 blocks with length = 256
kernel<<<blocks, threads>>>(data);
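The block count nx/256 truncates when nx is not a multiple of the block size. A host-side sketch of the usual round-up arithmetic in plain C++ follows; blocks_for is a hypothetical helper, not part of the CUDA API.

```cpp
#include <cassert>

// Number of blocks needed to cover nx threads with the given block size.
// Rounds up, so a partially filled last block is included.
constexpr unsigned blocks_for(unsigned nx, unsigned threads_per_block) {
    return (nx + threads_per_block - 1) / threads_per_block;
}
```

With a rounded-up grid the last block may contain threads whose index is >= nx, so the kernel is expected to check its index against nx before touching data.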
GAME DESIGN
- prefer event driven architecture in any case
- split the logic, application and view layers. The view presents logic state changes to users (a user can also be an AI); the application layer speaks with the hardware; the logic layer accumulates game state
- low-end video cards tend to have DirectX drivers
- the OGG format is open source; FMod is also a quite capable framework for playing sounds
- standard OS mem manager is not efficient
- different order of accessing n-dim arrays may affect performance. It depends on how arrays are stored in RAM
- the CPU reads and writes memory-aligned data much faster, i.e. an int stored at 0x04 is much better than one at 0x0400…0002. An address is on an N-byte boundary when address % N = 0, e.g. 32 % 4 = 0 puts address 32 on a 4-byte boundary
- always align data members: the performance increase comes for free
- do not let the compiler add alignment padding on its own. This can cause enormous memory waste. Use #pragma pack() to control alignment manually
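The trade-off between padding and packing can be inspected with sizeof. The struct names are hypothetical; exact sizes of the unpacked structs depend on the platform, and #pragma pack is a compiler extension (supported by GCC, Clang and MSVC).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Members in a poor order: the compiler typically pads after a and after c.
struct Padded {
    std::uint8_t  a;
    std::uint32_t b;
    std::uint8_t  c;
};

// Same members, largest first: less padding without any pragma.
struct Reordered {
    std::uint32_t b;
    std::uint8_t  a;
    std::uint8_t  c;
};

// Packed layout: no padding at all, but b may end up misaligned.
#pragma pack(push, 1)
struct Packed {
    std::uint8_t  a;
    std::uint32_t b;
    std::uint8_t  c;
};
#pragma pack(pop)

// True when address is a multiple of alignment (address % alignment == 0).
inline bool is_aligned(std::uintptr_t address, std::size_t alignment) {
    return address % alignment == 0;
}
```

On common 64-bit targets these sizes come out as 12, 8 and 6 bytes. Reordering members recovers most of the waste while keeping every member aligned, which packing does not.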